(54) NOVEL DNA POLYMERASE

(57) The present invention relates to a DNA polymerase which possesses the properties of 1) exhibiting
higher polymerase activity when assayed by using as a substrate a complex resulting from primer annealing to
a single stranded template DNA, as compared to the case where an activated DNA is used as a substrate;
2) possessing a 3'→5' exonuclease activity; and 3) being capable of amplifying a DNA fragment of about
20 kbp, in the case where polymerase chain reaction (PCR) is carried out using λ-DNA as a template. It also
relates to a DNA polymerase-constituting protein; a DNA containing the base sequence encoding the same;
and a method for producing the DNA polymerase. The present invention provides a novel DNA polymerase
possessing both a high primer extensibility and a 3'→5' exonuclease activity.
Description

TECHNICAL FIELD 

The present invention relates to a DNA polymerase which is useful as a reagent for genetic engineering, a method for
producing the same, and a gene encoding the same.

BACKGROUND ART 

DNA polymerases are useful enzymes as reagents for genetic engineering, and are widely used in methods for
determining a base sequence of DNA, labeling, site-directed mutagenesis, and the like. Also, thermostable DNA
polymerases have recently attracted attention with the development of the polymerase chain reaction (PCR)
method, and various DNA polymerases suitable for the PCR method have been developed and commercialized.

Presently known DNA polymerases can be roughly classified into four families according to amino acid sequence
homologies, among which family A (pol I type enzymes) and family B (α type enzymes) account for the great majority.
Although DNA polymerases belonging to each family generally possess mutually similar biochemical properties,
detailed comparison reveals that individual DNA polymerase enzymes differ from each other in terms of substrate
specificity, substrate analog incorporation, degree and rate of primer extension, mode of DNA synthesis, association of
exonuclease activity, optimum reaction conditions of temperature, pH and the like, and sensitivity to inhibitors. Thus,
of all available DNA polymerases, those possessing especially suitable properties for the respective experimental
procedures have been selectively used.

DISCLOSURE OF INVENTION

An object of the present invention is to provide a novel DNA polymerase not belonging to any of the above families,
and possessing biochemical properties not possessed by any of the existing DNA polymerases. For example, primer
extension activity and 3'→5' exonuclease activity are considered as mutually opposite properties, and none of the existing
DNA polymerase enzymes with strong primer extension activity possess 3'→5' exonuclease activity, which is an
important proofreading function for DNA synthesis accuracy. Also, the existing DNA polymerases possessing excellent
proofreading functions are poor in primer extension activity. Therefore, development of a DNA polymerase possessing both
potent primer extension activity and potent 3'→5' exonuclease activity would significantly contribute to DNA synthesis
in vitro.

Another object of the present invention is to provide a method for producing the novel DNA polymerase mentioned
above.

Still another object of the present invention is to provide a gene encoding the DNA polymerase of the present
invention.

As a result of extensive investigation, the present inventors have found genes of the novel DNA polymerase from the
hyperthermophilic archaebacterium Pyrococcus furiosus, have cloned the above genes, and have clarified that
two kinds of novel proteins together possess a novel DNA polymerase activity, which is exhibited under coexistence of the
two kinds of proteins. Furthermore, the present inventors have prepared a transformant into which the above
genes are introduced, and have succeeded in mass-producing the complex type DNA polymerase.

Accordingly, the gist of the present invention is as follows: 

[1] A DNA polymerase characterized in that the DNA polymerase possesses the following properties:

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer
annealing to a single stranded template DNA, as compared to the case where an activated DNA is used as a
substrate;

2) possessing a 3'→5' exonuclease activity;

3) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction
(PCR) is carried out using λ-DNA as a template under the following conditions:

PCR conditions:

(a) a composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 μM
each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin, 0.1% Triton X-100, 5.0 ng/50 μl λ-DNA,
10 pmole/50 μl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in Sequence Listing),
and 3.7 units/50 μl DNA polymerase;

(b) reaction conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds 
and at 68°C for 10 minutes; 

[2] The DNA polymerase according to the above item [1], characterized in that the DNA polymerase exhibits a lower
error rate in DNA synthesis as compared to Taq DNA polymerase; 

[3] The DNA polymerase according to the above item [1] or [2], wherein the molecular weight as determined by gel 
filtration method is about 220 kDa or about 385 kDa; 

[4] The DNA polymerase according to any one of the above items [1] to [3], characterized in that the DNA polymer- 
ase exhibits an activity under coexistence of two kinds of DNA polymerase-constituting protein, a first DNA 
polymerase-constituting protein and a second DNA polymerase-constituting protein; 

[5] The DNA polymerase according to the above item [4], characterized in that the molecular weights of the first 
DNA polymerase-constituting protein and the second DNA polymerase-constituting protein are about 90,000 Da 
and about 140,000 Da as determined by SDS-PAGE, respectively;

[6] The DNA polymerase according to the above item [4] or [5], characterized in that the first DNA polymerase-con- 
stituting protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the amino 
acid sequence as shown by SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing
substantially the same activity which results from deletion, insertion, addition or substitution of one or more amino
acids in the amino acid sequence; 

[7] The DNA polymerase according to the above item [4] or [5], characterized in that the second DNA polymerase- 
constituting protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the 
amino acid sequence as shown by SEQ ID NO:3 in Sequence Listing, or is a functional equivalent thereof
possessing substantially the same activity which results from deletion, insertion, addition or substitution of one or more
amino acids in the amino acid sequence; 

[8] The DNA polymerase according to item [4] or [5], characterized in that the first DNA polymerase-constituting 
protein which constitutes the DNA polymerase according to the above item [4] or [5] comprises the amino acid 
sequence as shown by SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing substan- 
tially the same activity which results from deletion, insertion, addition or substitution of one or more amino acids in 
the amino acid sequence, and that the second DNA polymerase-constituting protein which constitutes the DNA 
polymerase according to the above item [4] or [5] comprises the amino acid sequence as shown by SEQ ID NO:3 
in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity which results 
from deletion, insertion, addition or substitution of one or more amino acids in the amino acid sequence; 
[9] A first DNA polymerase-constituting protein which constitutes the DNA polymerase according to the above item 
[4] or [5], wherein the first DNA polymerase-constituting protein comprises the amino acid sequence as shown by 
SEQ ID NO:1, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more
amino acids in the amino acid sequence, as a functional equivalent thereof possessing substantially the same 
activity; 

[10] A second DNA polymerase-constituting protein which constitutes the DNA polymerase according to the above item
[4] or [5], wherein the second DNA polymerase-constituting protein comprises the amino acid sequence as shown 
by SEQ ID NO:3, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or 
more amino acids in the amino acid sequence as a functional equivalent thereof possessing substantially the same 
activity; 

[11] A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to the
above item [9], characterized in that the DNA comprises an entire sequence of a base sequence encoding the 
amino acid sequence as shown by SEQ ID NO:1 in Sequence Listing, or a partial sequence thereof, or that the 
DNA encodes a protein having an amino acid sequence resulting from deletion, insertion, addition or substitution 
of one or more amino acids in the amino acid sequence of SEQ ID NO:1 and possessing a function as the first DNA 
polymerase-constituting protein; 

[12] A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to the 
above item [9], characterized in that the DNA comprises an entire sequence of the base sequence as shown by
SEQ ID NO:2 in Sequence Listing or a partial sequence thereof, or that the DNA comprises a base sequence capa- 
ble of hybridizing thereto under stringent conditions; 

[13] A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to 
the above item [10], characterized in that the DNA comprises an entire sequence of a base sequence encoding the 
amino acid sequence as shown by SEQ ID NO:3, or a partial sequence thereof, or that the DNA encodes a protein
having an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino 
acids in the amino acid sequence of SEQ ID NO:3 and possessing a function as the second DNA polymerase-con- 
stituting protein; 
... second DNA polymerase-constituting protein according to ... the DNA comprises an entire sequence of the
base sequence as shown by SEQ ...

BRIEF DESCRIPTION OF DRAWINGS

Figure 4 is a graph showing the heat stability of the DNA polymerase of the present invention.
Figure 6 is an autoradiogram showing the primer extension activity of the DNA polymerase of the present invention.
BEST MODE FOR CARRYING OUT THE INVENTION

(1) DNA Polymerase of Present Invention and Constituting Proteins Thereof

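An example of the DNA polymerase of the present invention has the following properties:

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer
annealing to a single stranded template DNA, as compared to the case where an activated DNA is used as a
substrate;

2) possessing a 3'→5' exonuclease activity;

3) optimum pH being between 6.5 and 7.0 (in potassium phosphate buffer, at 75°C);

4) exhibiting a remaining activity of about 80% after heat treatment at 80°C for 30 minutes;

5) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction
(PCR) is carried out using λ-DNA as a template under the following conditions:

PCR conditions:

(a) a composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 μM
each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin (BSA), 0.1% Triton X-100, 5.0 ng/50 μl λ-DNA,
10 pmole/50 μl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in Sequence Listing),
and 3.7 units/50 μl DNA polymerase;

(b) reaction conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds and
at 68°C for 10 minutes; and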

6) The DNA polymerase of the present invention is superior to the Taq DNA polymerase ...
Takara Shuzo Co., Ltd.), in terms of primer extension properties in DNA synthesis reaction, for instance, DNA
strand length capable of DNA amplification by PCR method, and accuracy of DNA synthesis reaction (low error rate
in DNA synthesis).
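
As a rough arithmetic check on the PCR conditions stated for the 20 kbp amplification, the following sketch totals the programmed hold time and derives the lowest average extension rate that would still complete a 20 kbp product within one 68°C step. The function names are illustrative, ramp times between temperatures are ignored, and the rate is only a lower bound inferred from the stated figures, not a measured value.

```python
# Cycling program from condition (b): 30 cycles of 98 C for 10 s and 68 C for 10 min.
CYCLES = 30
DENATURE_S = 10        # seconds at 98 C
EXTEND_S = 10 * 60     # seconds at 68 C
TARGET_BP = 20_000     # "about 20 kbp" amplification target


def total_hold_seconds(cycles=CYCLES, denature_s=DENATURE_S, extend_s=EXTEND_S):
    """Programmed hold time only; temperature ramping is not included."""
    return cycles * (denature_s + extend_s)


def min_extension_rate(target_bp=TARGET_BP, extend_s=EXTEND_S):
    """Lowest average rate (nt/s) that finishes target_bp within one extension step."""
    return target_bp / extend_s


total = total_hold_seconds()
hours, rem = divmod(total, 3600)
print(f"{total} s = {hours} h {rem // 60} min of programmed hold time")
print(f"minimum average extension rate: {min_extension_rate():.1f} nt/s")
```

The run therefore occupies a little over five hours of hold time, and the polymerase must average at least roughly 33 nucleotides per second over the whole 68°C step.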

The DNA polymerase of the present invention is an enzyme constituted by two kinds of proteins, wherein the
molecular weight of the DNA polymerase of the present invention is about 220 kDa or about 385 kDa, as determined by gel
filtration, and which shows two bands corresponding to about 90,000 Da and about 140,000 Da on SDS-PAGE,
respectively. The protein of about 90,000 Da (corresponding to ORF3 as described below) is herein referred to as the
first DNA polymerase-constituting protein, and the protein of about 140,000 Da (corresponding to ORF4 as described
below) is herein referred to as the second DNA polymerase-constituting protein. It is assumed that in the DNA
polymerase of the present invention, the first DNA polymerase-constituting protein and the second DNA
polymerase-constituting protein are non-covalently bonded to form a complex in a molar ratio of 1:1 or 1:2.
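
The assumed molar ratios can be compared with the gel filtration values by simple addition of the SDS-PAGE subunit masses. The sketch below is illustrative only: the names are hypothetical, the 2:1 and 2:2 combinations are included purely for comparison, and gel filtration estimates depend on molecular shape, so agreement within a few percent is all that should be expected.

```python
FIRST_KDA = 90             # first constituting protein (SDS-PAGE estimate)
SECOND_KDA = 140           # second constituting protein (SDS-PAGE estimate)
OBSERVED_KDA = (220, 385)  # gel filtration estimates for the active enzyme


def complex_mass(n_first, n_second):
    """Summed subunit mass in kDa for a complex of n_first + n_second chains."""
    return n_first * FIRST_KDA + n_second * SECOND_KDA


for n_first, n_second in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    mass = complex_mass(n_first, n_second)
    nearest = min(OBSERVED_KDA, key=lambda obs: abs(obs - mass))
    print(f"{n_first}:{n_second} -> {mass} kDa "
          f"(nearest observed {nearest} kDa, difference {mass - nearest:+d})")
```

A 1:1 complex sums to 230 kDa and a 1:2 complex (one first protein, two second proteins) to 370 kDa, which are the closest matches to the observed 220 kDa and 385 kDa values among the combinations tried.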

The first DNA polymerase-constituting protein which constitutes the DNA polymerase of the present invention may
comprise the amino acid sequence shown by SEQ ID NO:1 in Sequence Listing, or may be a functional equivalent
possessing substantially the same activity. Also, the second DNA polymerase-constituting protein may comprise the amino
acid sequence shown by SEQ ID NO:3 in Sequence Listing, or may be a functional equivalent possessing substantially
the same activity.

The term "a functional equivalent" as described in the present specification is defined as follows. A protein existing
in nature can undergo mutation, such as deletion, insertion, addition and substitution, of amino acids in an amino acid
sequence thereof owing to modification reaction and the like of the protein itself in vivo or during purification, besides
causes such as polymorphism and mutation of the genes encoding it. However, it has been known that there are
some proteins which exhibit substantially the same physiological activities or biological activities as a protein without
mutation. Those proteins having structural differences as described above without showing any significant differences
in functions and activities are referred to as "a functional equivalent." Here, the number of
mutated amino acids is not particularly limited, as long as the resulting protein exhibits substantially the same
physiological activities or biological activities as a protein without mutation. Examples thereof include one or more
mutations, for instance, one or several mutations, more specifically one to about ten mutations (such as deletion, insertion,
addition and substitution) and the like.
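
The number of deletions, insertions and substitutions separating two amino acid sequences can be counted with a standard edit distance. The sketch below is a generic Levenshtein implementation offered only as an illustration; the two short peptides are made up, and a small count says nothing by itself about whether activity is retained, which must be assayed.

```python
def edit_distance(a, b):
    """Minimum number of single-residue insertions, deletions or substitutions
    needed to turn sequence a into sequence b (Levenshtein distance)."""
    prev = list(range(len(b) + 1))
    for i, res_a in enumerate(a, 1):
        cur = [i]
        for j, res_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                      # delete res_a
                           cur[j - 1] + 1,                   # insert res_b
                           prev[j - 1] + (res_a != res_b)))  # substitute or match
        prev = cur
    return prev[-1]


reference = "MKVLSTG"  # made-up peptide, not taken from the Sequence Listing
variant = "MKILSG"     # one substitution (V to I) and one deletion (T)
print(edit_distance(reference, variant))
```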

The same can be said for the resulting proteins in the case where the above mutation is artificially introduced into
the amino acid sequence of a protein. In this case, more diverse mutants can be prepared. For example, although the
methionine residue at the N-terminus of a protein expressed in Escherichia coli is reportedly often removed by the
action of methionine aminopeptidase, the methionine residue is not completely removed for some kinds of
proteins, so that those having the methionine residue and those without it can both be produced. However, the
presence or absence of the methionine residue does not affect protein activity in most cases. It is also known that a
polypeptide resulting from substitution of a particular cysteine residue with serine in the amino acid sequence of human
interleukin 2 (IL-2) retains IL-2 activity [Science, 224, 1431 (1984)].

In addition, during the production of a protein by genetic engineering, the desired protein is often expressed as a
fusion protein. For example, the N-terminal peptide chain derived from another protein is added to the N-terminus of
the desired protein to increase the amount of expression of the desired protein, or purification of the desired protein
is facilitated by adding an appropriate peptide chain to the N- or C-terminus of the desired protein, expressing the
protein, and using a carrier having affinity for the peptide chain added. Accordingly, a DNA polymerase having an amino
acid sequence which partially differs from that of the DNA polymerase of the present invention is within the
scope of the present invention as "a functional equivalent," as long as it exhibits substantially the same activities as the
DNA polymerase of the present invention.

(2) Gene of DNA Polymerase of Present Invention

The DNA encoding the first DNA polymerase-constituting protein which constitutes the DNA polymerase of the
present invention includes a DNA comprising an entire sequence of the base sequence encoding the amino acid
sequence as shown by SEQ ID NO:1 in Sequence Listing or a partial sequence thereof including, for instance, a DNA
comprising an entire sequence of the base sequence as shown by SEQ ID NO:2 or a partial sequence thereof.
Specifically, a DNA comprising a partial sequence of the base sequence encoding the amino acid sequence as shown by SEQ
ID NO:1 including, for instance, the DNA comprising a partial sequence of the base sequence as shown by SEQ ID
NO:2 in Sequence Listing, the base sequence encoding a protein possessing a function of the first DNA
polymerase-constituting protein, is also included in the scope of the present invention. Also, the above DNA includes a DNA
encoding a protein comprising an amino acid sequence resulting from deletion, insertion, addition, substitution and
the like of one or several amino acids in the amino acid sequence as shown by SEQ ID NO:1, the protein possessing a
function of the first DNA polymerase-constituting protein. Furthermore, a base sequence capable of hybridizing to the above
base sequences under the stringent conditions, the base sequence encoding a protein possessing a function of the first
DNA polymerase-constituting protein, is also included in the scope of the present invention.

The DNA encoding the second DNA polymerase-constituting protein which constitutes the DNA polymerase of the present
invention includes a DNA comprising an entire sequence of the base sequence encoding the amino acid sequence as shown by
SEQ ID NO:3 in Sequence Listing or a partial sequence thereof including, for instance, a DNA comprising an entire
sequence of the base sequence as shown by SEQ ID NO:4 in Sequence Listing or a partial sequence thereof. Specifically,
a DNA ... comprising a partial sequence of the base sequence as shown by SEQ ID NO:4 ...
the second DNA polymerase-constituting protein, is also included in the scope of the present invention.

The term "protein possessing a function of the first DNA polymerase-constituting protein" or "protein possessing a
function of the second DNA polymerase-constituting protein" herein refers to a protein ...
a DNA polymerase activity with various physicochemical properties shown in the above items 1) to ...

The phrase "capable of hybridizing under the stringent conditions" refers to hybridizing to a probe after incubation
... bovine serum albumin (BSA), ... polyvinylpyrrolidone, ... Ficoll 400, ...

The term "DNA containing a base sequence encoding an amino acid sequence" described in the present specification
... designating a particular amino acid on the gene. Therefore, there can be a DNA
... many generations of organism even when a gene ...

Moreover, it is not difficult to artificially produce a variety of genes encoding the same amino acid sequence by
means of various genetic engineering techniques. For example, when a codon used in the natural gene encoding the ...
low. In this case, high expression of the desired protein is ...
another one used at a high frequency in the host without changing the amino acid sequence
encoded (for instance, Japanese Patent Laid-Open No. Hei 7-102146). As described above, it is of course possible ...
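
Replacing a codon with a synonymous one used more frequently in the host, without changing the encoded amino acid, can be sketched as follows. The codon table is the standard genetic code; the replacement map and the short coding sequence are hypothetical illustrations (AGA and AGG are often cited as rarely used arginine codons in Escherichia coli, but no usage figures are taken from the source).

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical replacement map: rare arginine codons -> a commonly used one.
HYPOTHETICAL_PREFERRED = {"AGA": "CGT", "AGG": "CGT"}


def codons(dna):
    """Split a coding sequence into complete codons; a trailing partial codon is dropped."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    return "".join(CODON_TABLE[c] for c in codons(dna))


def recode(dna, preferred=HYPOTHETICAL_PREFERRED):
    """Swap codons according to `preferred`, refusing any non-synonymous change."""
    out = []
    for codon in codons(dna):
        new = preferred.get(codon, codon)
        if CODON_TABLE[new] != CODON_TABLE[codon]:
            raise ValueError(f"{codon}->{new} would change the amino acid")
        out.append(new)
    return "".join(out)


original = "ATGAGAAAAAGGTAA"  # made-up sequence encoding M-R-K-R-stop
recoded = recode(original)
print(recoded, translate(original) == translate(recoded))
```

The check inside `recode` is the important design point: the base sequence changes while the translated protein stays identical.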

(3) Method for Producing DNA Polymerase of Present Invention 

The present inventors have found genes of a novel DNA polymerase from a hyperthermophilic archaebacterium
Pyrococcus furiosus, and cloned them to clarify that the genes encode a novel DNA polymerase exhibiting its activity under
coexistence of two kinds of proteins on the genes. In the present invention, the DNA polymerase of the present invention
can be mass-produced by preparing a transformant incorporating the above genes ...
a process comprising culturing a transformant containing both the gene encoding the first DNA
polymerase-constituting protein and the gene encoding the second DNA polymerase-constituting protein, and collecting
the DNA polymerase from the resulting culture. Alternatively, ...
separately culturing a transformant containing the gene encoding the first DNA polymerase-constituting protein
and a transformant containing the gene encoding the second DNA polymerase-constituting protein, mixing the DNA
polymerase-constituting proteins contained in the resulting cultures, and collecting the DNA polymerase.

Here, the phrase "transformant containing both the gene encoding the first DNA polymerase-constituting protein
and the gene encoding the second DNA polymerase-constituting protein" may be a transformant ...
two expression vectors containing the respective genes, or a transformant ...
recombining both genes into one expression vector to allow the respective proteins to be expressed.
(4) The cloning of the genes of the DNA polymerase of the present invention, an analysis of the obtained clones, and the
physicochemical properties, activities, and applicability to the PCR method of the expressed DNA polymerase are
hereinafter described in detail.

The strain used for the present invention is not subject to particular limitation. Examples thereof include Pyrococcus
furiosus DSM3638, a strain belonging to the genus Pyrococcus. The above strain is available from
Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH. When the above strain was cultured in an appropriate
growth medium, a crude extract was prepared from the resulting culture, and the crude extract was subjected to
polyacrylamide gel electrophoresis, the present inventors found several kinds of protein bands
showing DNA polymerase activity in the gel; it was therefore anticipated that genes corresponding to these respective
bands exist. Specifically, the novel DNA polymerase gene and the product thereof can be cloned by the
procedures exemplified below.

1) DNA is extracted from Pyrococcus furiosus;

2) The DNA obtained in 1) is digested with an appropriate restriction endonuclease, to prepare a DNA library with
a plasmid, cosmid and the like, as a vector;

3) The library prepared in 2) is introduced into Escherichia coli, and a foreign gene is expressed to prepare a
protein library in which crude extracts of the resulting clones are collected;

4) A DNA polymerase activity is assayed by using the protein library prepared in 3), and a foreign DNA is taken out
from the Escherichia coli clone which provides a crude extract having an activity;

5) The Pyrococcus furiosus DNA fragment contained in the plasmid or cosmid taken out is analyzed to narrow
down the gene region encoding a protein exhibiting a DNA polymerase activity;

6) The base sequence of the region in which the protein exhibiting a DNA polymerase activity is presumably
encoded is determined to deduce the primary structure of the protein; and

7) An expression plasmid is constructed to take a form which more easily allows the expression of the protein
deduced in 6) in Escherichia coli, and the produced protein is purified and analyzed for the properties thereof.

The above DNA donor, Pyrococcus furiosus DSM3638, is a hyperthermophilic archaebacterium, which is cultured
at 95°C under anaerobic conditions. Known methods can be used for disrupting grown cells followed by
extracting and purifying DNA, for digesting the obtained DNA with a restriction endonuclease, and for other
procedures. Such methods are described in detail in Molecular Cloning: A Laboratory Manual, 75-178, published by
Cold Spring Harbor Laboratory in 1982, edited by T. Maniatis et al.

In the preparation of a DNA library, the triple helix cosmid vector (manufactured by Stratagene), for example, can
be used. The DNA of Pyrococcus furiosus is partially digested with Sau3AI (manufactured by Takara Shuzo Co., Ltd.),
and the digested DNA is subjected to density gradient centrifugation to obtain the long DNA fragments. They are ligated
to the BamHI site of the above vector, followed by in vitro packaging. The respective transformants obtained from the
DNA library thus prepared are separately cultured. After harvesting, cells are disrupted by ultrasonication, and the
resulting lysate is heat-treated to inactivate the DNA polymerase from the host Escherichia coli. Thereafter, a
supernatant containing a thermostable protein can be obtained by centrifugation. The above supernatant is named
a cosmid protein library. By assaying the DNA polymerase activity using a portion of the supernatant, a clone
that expresses the DNA polymerase derived from Pyrococcus furiosus can be obtained. DNA polymerase activity can
be assayed using the known method described in DNA Polymerase from Escherichia coli, published by Harper and
Row, edited by D.R. Davis, 263-276 (authored by C.C. Richardson).
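
The reason Sau3AI fragments can be ligated into a BamHI site is that both enzymes leave the same 5'-GATC overhang: Sau3AI cuts immediately before GATC, and BamHI cuts between the two G residues of GGATCC. The sketch below locates Sau3AI sites in a made-up sequence and lists the fragments of a complete digest; in a partial digest, as used for the library, only a subset of these sites is cut, so the fragments are longer. The helper names and the test sequence are illustrative.

```python
SAU3AI_SITE = "GATC"    # cut immediately before the G: ^GATC
BAMHI_SITE = "GGATCC"   # cut after the first G: G^GATCC (same GATC overhang)


def cut_positions(seq, site, offset):
    """Indices at which the top strand is cut for every occurrence of `site`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragments(seq, cuts):
    """Pieces produced when the sequence is cut at every position in `cuts`."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


seq = "AAGATCTTTGGATCCAAGATCGG"  # made-up sequence for illustration
sau3ai_cuts = cut_positions(seq, SAU3AI_SITE, 0)
print(sau3ai_cuts)
print(fragments(seq, sau3ai_cuts))
print(cut_positions(seq, BAMHI_SITE, 1))
```

Every BamHI site contains a Sau3AI site, which is why the BamHI cut position coincides with one of the Sau3AI cut positions in the output.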

One of the DNA polymerase genes of Pyrococcus furiosus has already been cloned and its structure clarified by
the present inventors, as described in Nucleic Acids Research, 21, 259-265 (1993). The translation product of the
above gene is a polypeptide having a molecular weight of about 90,000 Da and consisting of 775 amino acids, and the
amino acid sequence thereof clearly contains conserved sequences of the α-type DNA polymerases. In fact, since the
DNA polymerase activity exhibited by this gene product is inhibited by aphidicolin, which is a specific inhibitor of α-type
DNA polymerases, the above DNA polymerase is distinguishable from the DNA polymerase of the present invention.

Therefore, clones carrying the above known gene can be removed from the obtained clones exhibiting thermostable DNA
polymerase activity by a process comprising digesting the cosmid contained in each clone, carrying out hybridization with the
above gene as a probe, and selecting an unhybridizing clone. A restriction endonuclease map of the DNA insert can be
prepared for the cosmid of the resulting clone containing the novel DNA polymerase gene. Next, a location
of the DNA polymerase gene on the above DNA fragment can be determined by a process comprising dividing the
above DNA fragment into various regions on the basis of the obtained restriction endonuclease map, subcloning each
region into a plasmid vector, introducing the resulting vector into Escherichia coli, and assaying the thermostable DNA
polymerase activity exhibited therein. An XbaI-XbaI DNA fragment of about 10 kbp containing the DNA polymerase
gene can be thus obtained.
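
The stated size of the previously cloned α-type polymerase (775 amino acids, about 90,000 Da) can be sanity-checked with the common rule of thumb of roughly 110 Da per residue. That average is an assumption brought in for illustration and is not a figure from this specification; actual masses depend on amino acid composition.

```python
AVG_RESIDUE_DA = 110  # rule-of-thumb average residue mass (assumption)


def estimate_mass_da(n_residues, avg_residue_da=AVG_RESIDUE_DA):
    """Rough protein mass from chain length; composition effects are ignored."""
    return n_residues * avg_residue_da


estimate = estimate_mass_da(775)
print(f"775 residues -> about {estimate} Da")
```

The estimate of about 85,000 Da is of the same order as the roughly 90,000 Da stated for the known gene product.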
... Escherichia coli harboring a plasmid incorporating the above DNA fragment exhibits a sufficient
level of a DNA synthesis activity in the crude extract thereof even after treatment at 90°C for 20 minutes, while such an
activity is not found in ... a plasmid without incorporating a DNA fragment. Therefore, it can be concluded that
information for producing a thermostable polymerase is present on the DNA fragment, and that a gene having the above
information is expressed in the above Escherichia coli. The plasmid resulting from recombination of the DNA fragment
... (manufactured by Takara Shuzo Co., Ltd.) is named pFU1001. The Escherichia coli JM109
transformed with the above plasmid is named and identified as Escherichia coli JM109/pFU1001, and has been deposited
under the accession number FERM BP-5579 with the National Institute of Bioscience and Human-Technology, Agency of
Industrial Science and Technology, Ministry of International Trade and Industry, of which the address is 1-3, Higashi
1-chome, Tsukuba-shi, Ibaraki-ken, 305, Japan, since August 11, 1995 (date of original deposit) under the Budapest
Treaty.

The base sequence of the DNA fragment inserted in the plasmid pFU1001 can be determined by a conventional
method, for instance, the dideoxy method. Furthermore, regions capable of encoding a protein in the base
sequence, i.e. open reading frames (ORFs), can be deduced by analyzing the resulting base sequence.
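
Deducing open reading frames from a determined base sequence can be sketched as a scan of each reading frame for an ATG start codon followed in frame by a stop codon. This is a minimal illustration under simplifying assumptions (ATG as the only start codon, a made-up test sequence, hypothetical function names); the same scan is applied to the reverse complement to cover the other strand.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (frame, start, end) for each ATG...stop stretch on this strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    start and stop codons as well.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


seq = "CCATGAAATTTGGGTAACC"  # made-up sequence for illustration
print(find_orfs(seq))
print(find_orfs(reverse_complement(seq)))
```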

An 8,450 bp sequence in the base sequence of the XbaI-XbaI DNA fragment of about 10 kbp inserted in the
plasmid ... ORF5, and ORF6, respectively, naming from the 5' terminal side. FIG. 2 shows
... the XbaI-XbaI ... location of the ORFs ... the fragment ...

A sequence exhibiting homologies to that of known DNA polymerases was not found in any one of the above six
ORFs. It should be noted, however, that on ORF1 and ORF2, there is a sequence homologous to the CDC6 protein
found ... a sequence homologous to the CDC18 protein found in Schizosaccharomyces
... synthesis phase (S phase) in yeasts, the proteins regulating initiation of the DNA replication. Also, the ORF6 has a
sequence homologous to the RAD51 protein ... DNA damage repair in yeasts and recombination ... in the
somatic mitosis phase and in the meiosis phase in yeasts, and a sequence homologous to the Dmc1 protein, a meiosis
... RAD51 protein. The gene encoding the RAD51 is also known to be expressed ...
the cell cycle shift from the G1 to S phase. For the other ORFs, namely ORF3, ORF4, and ORF5, there have been no
known proteins found to have a homologous sequence.

It is possible to determine from which of the above ORFs the thermostable DNA polymerase activity is derived by
... preparing recombinant plasmids inserted with the respective DNA fragments deleting various
... each of the plasmids, and assaying the thermostable polymerase activity of each transformant.
A transformant resulting from transformation with a recombinant plasmid inserted with a DNA fragment
prepared by deleting ORF1 or ORF2, or deleting ORF5 or ORF6, from the above XbaI-XbaI DNA fragment of about 10 kbp
retains the DNA polymerase activity, while those resulting from transformation with a recombinant plasmid
inserted with a DNA fragment prepared by deleting ORF3 or ORF4 lose the activity. This fact predicts that the DNA
polymerase activity is encoded by ORF3 or ORF4.
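
The inference drawn from the deletion constructs can be written out as simple set logic: an ORF is a candidate for encoding the activity if its deletion abolishes activity and it is not among the ORFs whose deletion leaves activity intact. The sketch below records each deletion as a separate construct, which is an assumption about how the constructs were made; the names are illustrative.

```python
# (deleted ORFs, thermostable polymerase activity retained?) as described in the text,
# modelled here as one construct per deleted ORF (an assumption).
deletion_results = [
    ({"ORF1"}, True),
    ({"ORF2"}, True),
    ({"ORF5"}, True),
    ({"ORF6"}, True),
    ({"ORF3"}, False),
    ({"ORF4"}, False),
]


def required_orfs(results):
    """ORFs whose deletion abolished activity, minus any shown to be dispensable."""
    dispensable = set().union(*(deleted for deleted, active in results if active))
    candidates = set().union(*(deleted for deleted, active in results if not active))
    return candidates - dispensable


print(sorted(required_orfs(deletion_results)))
```

The result, ORF3 and ORF4, matches the prediction stated in the text; whether one or both are needed is settled by the separate expression experiments that follow.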

It is possible to determine by which of ORF3 and ORF4 the DNA polymerase is encoded by a process comprising
preparing recombinant plasmids separately inserted with the respective ORFs, transforming a host with each of the
plasmids, and assaying exhibition of a thermostable DNA polymerase activity in each transformant obtained.
... activity is detected in a crude extract obtained from the transformant
containing ORF3 or ORF4 alone. However, since a similar level of a thermostable DNA polymerase activity to that in
the transformant containing both ORF3 and ORF4 can be obtained in the case where the two extracts are mixed, it is
... the DNA polymerase of the present invention ... the translation products of the two ORFs.
It is possible to find out whether the two proteins form a complex to exhibit the DNA polymerase activity, or
one modifies the other to convert it to an active enzyme, by determining the molecular weight of the DNA polymerase.
... the molecular weight of the DNA polymerase by gel filtration method
demonstrate that the above two proteins form a complex.

The base sequence of ORF3 is shown by SEQ ID NO:2 in Sequence Listing, and the amino acid sequence of the
ORF3-derived translation product, namely the first DNA polymerase-constituting protein as deduced from the base
sequence, is shown by SEQ ID NO:1. The base sequence of ORF4 is shown by SEQ ID NO:4 in Sequence Listing, and
the amino acid sequence of the ORF4-derived translation product, namely the second DNA polymerase-constituting
protein as deduced from the base sequence, is shown by SEQ ID NO:3.

The DNA polymerase of the present invention can be expressed in cells by culturing a transformant resulting from
transformation with a recombinant plasmid into which both ORF3 and ORF4 are introduced, for instance, Escherichia
coli JM109/pFU1001, under usual culturing conditions, for instance, culturing in an LB medium (10 g/l trypton, 5 g/l
yeast extract, 5 g/l NaCl, pH 7.2) [...]. The DNA polymerase can be purified from the cultured cells to the extent
that nearly only the two bands of the two kinds of DNA polymerase-constituting proteins are obtained in
SDS-polyacrylamide gel electrophoresis (SDS-PAGE), by carrying out ultrasonication, heat treatment, and
chromatography using an anionic exchange column (RESOURCE Q column, manufactured by Pharmacia), a heparin
Sepharose column (HiTrap Heparin, manufactured by Pharmacia), a gel filtration column (Superose 6HR, manufactured
by Pharmacia) or the like. It is also possible to obtain the desired DNA polymerase by a process comprising
separately culturing transformants respectively containing ORF3 or ORF4 alone as described above, and subsequently
mixing the cultured cells obtained, their crude extracts, or the purified DNA polymerase-constituting proteins. When
mixing the two kinds of DNA polymerase-constituting proteins, special procedures are not required, and the active
DNA polymerase can be obtained simply by mixing the extracts from the respective transformants, or the two proteins
purified therefrom, in appropriate amounts.

The DNA polymerase of the present invention thus obtained provides two bands at positions corresponding to
molecular weights of about 90,000 Da and about 140,000 Da on SDS-PAGE, and these two bands correspond
to the first and second DNA polymerase-constituting proteins, respectively.

As shown in FIG. 3, the DNA polymerase of the present invention exhibits an optimum pH in the neighborhood
of 6.5 to 7.0 at 75°C in a potassium phosphate buffer. When the enzyme activity of the above DNA polymerase is
assayed at various temperatures, the enzyme exhibits a high activity at 75° to 80°C. However, because the double
stranded structure of the activated DNA used as a substrate for the activity assay is destroyed at higher
temperatures, an accurate optimum temperature for the activity of the above enzyme has not been determined. The
above DNA polymerase possesses a high heat stability, retaining not less than 80% of its activity even after a heat
treatment at 80°C for 30 minutes, as shown in FIG. 4. This level of heat stability allows the use of the above enzyme
for the PCR method.

Also, when assessing the influence of aphidicolin, a specific inhibitor of α-type DNA polymerases, it is demonstrated
that the activity of the above DNA polymerase is not inhibited even in the presence of 2 mM aphidicolin.

As a result of analyzing the biochemical properties of the purified DNA polymerase, the DNA polymerase of the
present invention is found to possess very excellent primer extension activity in vitro. As shown in Table 1, in the
case where DNA polymerase activity is assayed using a substrate in a form resulting from primer annealing to a
single stranded DNA (the M13-HT Primer), higher nucleotide incorporating activity can be demonstrated as compared
to the case of the activated DNA used for usual activity assaying (DNase I-treated calf thymus DNA). When the primer
extension ability of the DNA polymerase of the present invention is compared with that of other DNA polymerases
using the above M13-HT Primer substrate, the DNA polymerase of the present invention exhibits superior extension
activity as compared to the known DNA polymerase derived from Pyrococcus furiosus (Pfu DNA polymerase,
manufactured by Stratagene) and Taq DNA polymerase derived from Thermus aquaticus (TaKaRa Taq, manufactured by
Takara Shuzo Co., Ltd.). Furthermore, when an activated DNA is added to this reaction system as a competitor
substrate, the primer extension activities of the above two kinds of DNA polymerases are strongly inhibited, while
that of the DNA polymerase of the present invention is inhibited only at a low level, demonstrating that the DNA
polymerase of the present invention possesses a high affinity for substrates of the primer extension type (FIG. 6).

Table 1

                                         Relative Activity
                         ---------------------------------------------------
Substrates               DNA Polymerase of the   Pfu DNA        Taq DNA
                         Present Invention       Polymerase     Polymerase
----------------------------------------------------------------------------
activated DNA            100                     100            100
thermal-denatured DNA    340                     87             130
M13-HT primer            170                     23             90
M13-RNA primer           52                      0.49           38
poly dA-Oligo dT         94                      390            290
poly A-Oligo dT          0.085                   -              0.063



Also, the DNA polymerase of the present invention shows excellent performance when used for the PCR method.
With the DNA polymerase derived from Thermus aquaticus, commonly used for the PCR method, it is difficult to amplify
a DNA fragment of not less than 10 kbp using that DNA polymerase alone, and a DNA fragment of not less than
20 kbp can be amplified only when it is used in combination with another DNA polymerase [Proceedings of the National
Academy of Sciences of the USA, 91, 2216-2220 (1994)]. Also, the strand length of DNA amplifiable using the Pfu DNA
polymerase is reportedly at most about 3 kbp. By contrast, when using the DNA polymerase of the present invention,
the amplification of a DNA fragment of 20 kbp in length is made possible even when it is used alone without addition
of any other enzymes.

Moreover, the DNA polymerase of the present invention, which also has an associated 3'→5' exonuclease activity, is
comparable to the Pfu DNA polymerase, known to ensure very high accuracy in DNA synthesis owing to its high activity,
in terms of the ratio of the exonuclease activity to the DNA polymerase activity (FIG. 5). Also, the error rate during
the DNA synthesis reaction is lower for the DNA polymerase of the present invention than for the Taq DNA polymerase.
These various properties demonstrate that the DNA polymerase of the present invention serves very excellently as a
reagent for genetic engineering techniques such as the PCR method.

The finding of the novel DNA polymerase genes according to the present invention also provides an interesting
suggestion as follows. In order to determine the manner in which the region containing the genes for ORF3 and ORF4
encoding a novel DNA polymerase is intracellularly transcribed, the present inventors have analyzed an RNA fraction
prepared from Pyrococcus furiosus cells by the northern blotting method, the RT-PCR method and the primer extension
method. As a result, it is confirmed that ORF1 to ORF6 are transcribed from immediately upstream of ORF1 as a single
messenger RNA (mRNA). From the above finding, there is an expectation that the production of the ORF1 and the ORF2
products in cells is subjected to the same control as that for the ORF3 and the ORF4 products. When considered in
combination with the sequence homologies of ORF1, ORF2, ORF5, and ORF6 to CDC6 and CDC18, which are involved in
the regulation of initiation of DNA replication in yeasts, the above expectation suggests that the
novel DNA polymerase of the present invention is highly likely to be a DNA polymerase important for DNA
replication. Since it is also expected that the DNA replication system of archaebacteria, to which group Pyrococcus
furiosus belongs, is closely related to that of eukaryotic cells, there is a possibility that an enzyme similar to the
DNA polymerase of the present invention exists in eukaryotes as a DNA polymerase important for replication that has
not yet been found.

It is also expected that thermostable DNA polymerases similar to the DNA polymerase of the present invention are
produced in other bacteria belonging to hyperthermophilic archaebacteria like Pyrococcus furiosus, including, for
instance, bacteria other than Pyrococcus furiosus belonging to the genus Pyrococcus; and bacteria belonging to the
genus Pyrodictium, the genus Thermococcus, the genus Staphylothermus, and other genera. When these enzymes are
constituted by two DNA polymerase-constituting proteins, like the DNA polymerase of the present invention, it is
expected that a similar DNA polymerase activity is exhibited by combining one of the two DNA polymerase-constituting
proteins with the DNA polymerase-constituting protein of the present invention corresponding to the other DNA
polymerase-constituting protein.

The thermostable DNA polymerases similar to the DNA polymerase of the present invention, produced by the
above hyperthermophilic archaebacteria, are expected to have homology to the DNA polymerase of the present
invention in terms of the amino acid sequence and the base sequence of the gene encoding it. It is therefore possible
to obtain a gene for a thermostable DNA polymerase similar to the DNA polymerase of the present invention, of which
the base sequence is not identical to that of the DNA polymerase of the present invention but which possesses similar
enzyme activities, by a process comprising introducing into an appropriate microorganism a DNA fragment obtained
from one of the above thermophilic archaebacteria by hybridization using, as a probe, a gene isolated by the present
invention or a portion of the above base sequence, and assaying the DNA polymerase activity in a heat-treated lysate
prepared in the same manner as the above cosmid protein library by an appropriate method.

The above hybridization can be carried out under the following conditions. Specifically, a DNA-immobilized
membrane is incubated with a probe at 50°C for 12 to 20 hours in 6 x SSC (wherein 1 x SSC indicates 0.15 M NaCl,
0.015 M sodium citrate, pH 7.0) containing 0.5% SDS, 0.1% bovine serum albumin, 0.1% polyvinyl pyrrolidone, 0.1%
Ficoll 400, and 0.01% denatured salmon sperm DNA. After termination of the incubation, the membrane is washed,
initiating at 37°C in 2 x SSC containing 0.5% SDS, and lowering the SSC concentration down to 0.1 x SSC from the
starting level while raising the temperature up to 50°C, until the signal from the immobilized DNA becomes
distinguishable from the background.
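
The salt concentrations implied by the SSC strengths named above can be worked out with a short calculation. The following is only an illustrative sketch: it assumes the standard composition of 1 x SSC (0.15 M NaCl and 0.015 M sodium citrate), and the function name is hypothetical.

```python
# Composition of 1 x SSC in mol/l (standard composition, taken here as an assumption).
ONE_X_SSC = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold` x SSC."""
    return {name: round(conc * fold, 6) for name, conc in ONE_X_SSC.items()}


# Hybridization (6 x), first wash (2 x), and most stringent wash (0.1 x).
for fold in (6, 2, 0.1):
    print(f"{fold} x SSC:", ssc_molarity(fold))
```

Lower SSC strength means lower ionic strength, which together with the higher wash temperature makes the washing conditions more stringent.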

It is also possible to obtain a gene for a thermostable DNA polymerase similar to the DNA polymerase of the
present invention, of which the DNA polymerase activity is not identical to but of the same level as that of the DNA
polymerase of the present invention, by introducing into an appropriate microorganism a DNA fragment obtained by a
gene amplification reaction using, as a primer, a gene isolated by the present invention or a portion of the base
sequence of the gene, with a DNA obtained from one of the above thermophilic archaebacteria as a template, or a DNA
fragment obtained from the thermophilic archaebacterium by hybridization using, as a probe, the fragment obtained by
the gene amplification reaction, and assaying the DNA polymerase activity in the same manner as above.

The present invention is hereinafter described by means of the following examples, but the scope of the present
invention is not limited only to those examples. The % values shown in Examples below mean % by weight.
Example 1

(1) Preparation of Pyrococcus furiosus Genomic DNA

Pyrococcus furiosus DSM3638 was cultured in the following manner:

A medium having a composition comprising 1% trypton, 0.5% yeast extract, 1% soluble starch, 3.5% Jamarin S
Solid (Jamarin Laboratory), 0.5% Jamarin S Liquid (Jamarin Laboratory), 0.003% MgSO4, 0.001% NaCl, 0.0001%
FeSO4·7H2O, 0.0001% CoSO4, 0.0001% CaCl2·7H2O, 0.0001% ZnSO4, 0.1 ppm CuSO4·5H2O, 0.1 ppm
KAl(SO4)2, 0.1 ppm H3BO3, 0.1 ppm Na2MoO4·2H2O, and 0.25 ppm NiCl2·6H2O was placed in a two-liter medium
bottle and sterilized at 120°C for 20 minutes. After removal of dissolved oxygen by sparging with nitrogen gas,
the above strain was inoculated into the resulting medium, and the culture was kept standing at
95°C for 16 hours. After termination of the cultivation, cells were harvested by centrifugation.

The harvested cells were then suspended in 4 ml of 0.05 M Tris-HCl (pH 8.0) containing 25% sucrose. To this
suspension, 0.8 ml of lysozyme [5 mg/ml, 0.25 M Tris-HCl (pH 8.0)] and 2 ml of 0.2 M EDTA were added and incubated at
20°C for 1 hour. After adding 24 ml of an SET solution [150 mM NaCl, 1 mM EDTA, and 20 mM Tris-HCl (pH 8.0)], 4 ml
of 5% SDS and 400 µl of proteinase K (10 mg/ml) were added to the resulting mixture. Thereafter, the resulting mixture
was reacted at 37°C for 1 hour. After termination of the reaction, phenol-chloroform extraction and subsequent ethanol
precipitation were carried out to prepare about 3.2 mg of genomic DNA.


(2) Preparation of Cosmid Protein Library 

Four hundred micrograms of the genomic DNA from Pyrococcus furiosus DSM3638 was partially digested with
Sau3AI and fractionated by size into 35 to 50 kb fractions by the density gradient ultracentrifugation method. One
microgram of the triple helix cosmid vector (manufactured by Stratagene) was digested with XbaI, dephosphorylated
using an alkaline phosphatase (manufactured by Takara Shuzo Co., Ltd.), and further digested with BamHI. The resulting
treated vector was subjected to ligation after mixing with 140 µg of the above 35 to 50 kb DNA fractions. The genomic
DNA fragment from Pyrococcus furiosus was packaged into lambda phage particles by the in vitro packaging method using
"GIGAPACK GOLD" (manufactured by Stratagene), to prepare a library. A portion of the obtained library was then
transduced into E. coli DH5αMCR. Several of the resulting transformants were selected to prepare cosmid DNA.
After confirmation of the presence of an insert of appropriate size, about 500 transformants were again
selected from the above library, and each was separately cultured in 150 ml of an LB medium (10 g/l trypton, 5 g/l yeast
extract, 5 g/l NaCl, pH 7.2) containing 100 µg/ml ampicillin. The resulting culture was centrifuged, the harvested
cells were suspended in 1 ml of 20 mM Tris-HCl at a pH of 8.0, and the resulting suspension was then heat-treated at
100°C for 10 minutes. Next, ultrasonication was carried out, and a heat treatment was carried out again at 100°C for 10
minutes. The lysate obtained as a supernatant after centrifugation was used as a cosmid protein library.

(3) Assay of DNA Polymerase Activity 

The DNA polymerase activity was assayed using calf thymus DNA (manufactured by Worthington) activated by
DNase I treatment (activated DNA) as a substrate. DNA activation and assay of DNA polymerase activity were carried
out by the method described in DNA Polymerase from Escherichia coli, 263-276 (authored by C.C. Richardson),
published by Harper & Row, edited by D.R. Davis.

An assay of enzyme activity was carried out by the following method. Specifically, 50 µl of a reaction solution [20
mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA, 40 µM each of dATP, dCTP,
dGTP and dTTP, 60 nM [3H]-dTTP (manufactured by Amersham)], containing a sample for assaying its activity, was
prepared and reacted at 75°C for 15 minutes. A 40 µl portion of this reaction mixture was then spotted onto a DE paper
(manufactured by Whatman) and washed with 5% Na2HPO4 five times. The remaining radioactivity on the DE paper
was assayed using a liquid scintillation counter. The amount of enzyme which incorporated 10 nmol of [3H]-dTMP per
30 minutes into the substrate DNA, assayed by the above-described enzyme activity assay method, was defined as one
unit of the enzyme.
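
The unit definition can be illustrated with a small calculation. The sketch below is not part of the described method: it assumes that nucleotide incorporation is linear in time, so that a 15-minute assay can be scaled to the 30-minute time base of the unit definition, and that the 40 µl spotted aliquot is representative of the 50 µl reaction. The function name and the example input are hypothetical.

```python
UNIT_NMOL = 10.0      # nmol of dTMP incorporated per unit (from the unit definition)
UNIT_MINUTES = 30.0   # time base of the unit definition


def units_from_spot(nmol_in_spot, spot_ul=40.0, reaction_ul=50.0, assay_minutes=15.0):
    """Convert dTMP incorporation measured in the spotted aliquot into enzyme units.

    Assumption (not stated in the text): incorporation is linear in time,
    so the 15-minute value can be scaled to 30 minutes.
    """
    nmol_in_reaction = nmol_in_spot * reaction_ul / spot_ul
    nmol_per_30_min = nmol_in_reaction * UNIT_MINUTES / assay_minutes
    return nmol_per_30_min / UNIT_NMOL


# Hypothetical example: 2 nmol of dTMP found on the DE paper.
print(units_from_spot(2.0))
```

Converting scintillation counts into nmol of dTMP requires the specific radioactivity of the nucleotide mixture, which is outside the scope of this sketch.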

(4) Selection of Cosmid Clones Containing DNA Polymerase Gene 

A reaction mixture comprising 20 mM Tris-HCl (pH 7.7), 2 mM MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml
activated DNA, 40 µM each of dATP, dCTP, dGTP and dTTP, and 60 nM [3H]-dTTP (manufactured by Amersham) was prepared.
One µl each of the extracts of 5 clones from the cosmid protein library, namely 5 µl of extracts per
reaction, was added to 45 µl of this mixture. After the mixture was reacted at 75°C for 15 minutes, a 40 µl portion of
each reaction mixture was spotted onto a DE paper and washed with 5% Na2HPO4 five times. The remaining
radioactivity on the DE paper was assayed using a liquid scintillation counter. A group found to have some activity
by the primary assay, wherein one group consisted of 5 clones, was separated into its 5 individual clones, and then a
secondary assay was carried out for each clone. Since it had been already known by a hybridization test with a known
DNA polymerase gene as a probe that the cosmid DNA library included clones containing that gene, designated as Clone
Nos. 57, 154, 162, and 363, the 5 clones possessing DNA synthesis activity other than those clones were identified as
Clone Nos. 41, 153, 264, 462, and 491.
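
The two-stage (pooled, then individual) assay described above can be sketched as follows. This is only an illustration: it assumes exactly 500 clones numbered consecutively and pooled in numerical order, which the text does not state, and `pooled_screen` and `is_active` are hypothetical names standing in for the laboratory assay.

```python
def pooled_screen(clone_ids, is_active, pool_size=5):
    """Two-stage screen: assay pools first, then each clone of every positive pool.

    Returns the list of active clones and the total number of assays performed.
    """
    hits = []
    assays = 0
    for start in range(0, len(clone_ids), pool_size):
        pool = clone_ids[start:start + pool_size]
        assays += 1  # primary assay on the mixed extracts of one pool
        if any(is_active(clone) for clone in pool):
            for clone in pool:
                assays += 1  # secondary assay on a single clone
                if is_active(clone):
                    hits.append(clone)
    return hits, assays


# Clone numbers reported as active in the text (known gene plus the 5 new clones).
ACTIVE = {41, 57, 153, 154, 162, 264, 363, 462, 491}
clones = list(range(1, 501))  # assumption: 500 clones, pooled in numerical order
hits, assays = pooled_screen(clones, lambda c: c in ACTIVE)
print(hits, assays)
```

Under these assumptions the pooled design needs far fewer reactions than assaying all clones individually, which is the practical reason for grouping extracts when positives are rare.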

(5) Preparation of Restriction Endonuclease Map

Cosmids were isolated from the above 5 clones, and each cosmid was digested with BamHI. When the resulting
migration patterns were examined, several mutually common bands were found, suggesting that the 5 clones
carry overlapping regions slightly shifted relative to one another. With this finding in mind, the DNA inserts in
Clone Nos. 264 and 491 were used to prepare the restriction endonuclease map. The cosmids prepared from both clones
were digested with various restriction endonucleases. As a result of determining the respective cleavage sites of
KpnI, NotI, PstI, SmaI, XbaI, and XhoI (all manufactured by Takara Shuzo Co., Ltd.), which digested the inserts into
fragments of appropriate sizes, a map as shown in FIG. 1 was obtained.

(6) Subcloning of DNA Polymerase Gene

On the basis of the restriction endonuclease map as shown in FIG. 1, various DNA fragments of about 10 kbp in
length were cut out from the cosmid derived from Clone No. 264 or 491. The fragments were then subcloned into the
pTV118N or pTV119N vector (manufactured by Takara Shuzo Co., Ltd.). The resulting transformant with each of the
recombinant plasmids was then subjected to assaying of the thermostable DNA polymerase activity, to demonstrate
that a gene for production of a highly thermostable DNA polymerase was present on an XbaI-XbaI fragment of about 10
kbp. A plasmid resulting from recombination of the XbaI-XbaI fragment into the pTV118N vector was then named
plasmid pFU1001, and the Escherichia coli JM109 transformed with the plasmid was named Escherichia coli
JM109/pFU1001.

Example 2

Determination of Base Sequence of DNA Fragment Containing Novel DNA Polymerase Gene 

The above XbaI-XbaI fragment, containing the DNA polymerase gene, was again cut out from the plasmid
pFU1001 obtained in Example 1 with XbaI, and blunt-ended using a DNA blunting kit (manufactured by Takara Shuzo
Co., Ltd.). The resultant was then ligated to the new pTV118N vector, previously linearized with SmaI, in different
orientations to yield plasmids for preparing deletion mutants. The resulting plasmids were named pFU1002 and
pFU1003, respectively. Deletion mutants were sequentially prepared from both ends of the DNA insert using these
plasmids. The Kilo-Sequence deletion kit (manufactured by Takara Shuzo Co., Ltd.) applying Henikoff's method (Gene,
28, 351-359) was used for the above preparation. The 3'-overhanging type and 5'-overhanging type restriction
endonucleases used were PstI and XbaI, respectively. The base sequence of the insert was determined by the dideoxy
method using the BcaBEST dideoxy sequencing kit (manufactured by Takara Shuzo Co., Ltd.) with the various deletion
mutants as templates.

An 8,450 bp sequence in the base sequence determined is shown by SEQ ID NO:5 in Sequence Listing. As a result
of analysis of the base sequence, there were revealed six open reading frames (ORFs) capable of encoding proteins,
present at positions corresponding to Base Nos. 123-614 (ORF1), 611-1381 (ORF2), 1384-3222 (ORF3), 3225-7013
(ORF4), 7068-7697 (ORF5), and 7711-8385 (ORF6) in the base sequence as shown by SEQ ID NO:5 in Sequence
Listing. The restriction endonuclease map of the about 10 kbp XbaI-XbaI DNA fragment recombined in the plasmid
pFU1001 and the location of the above-mentioned ORFs thereon are shown in FIG. 2.
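
As a simple consistency check on the base positions given above, the length of each ORF and the number of codons it spans can be computed. The sketch below uses only the coordinates from the text; whether each stated range includes the stop codon is not stated, so the codon count is an upper bound on the number of encoded amino acids. The function name is hypothetical.

```python
# Base positions (1-based, inclusive) of the six ORFs within SEQ ID NO:5.
ORFS = {
    "ORF1": (123, 614),
    "ORF2": (611, 1381),
    "ORF3": (1384, 3222),
    "ORF4": (3225, 7013),
    "ORF5": (7068, 7697),
    "ORF6": (7711, 8385),
}


def orf_summary(start, end):
    """Return (length in bp, whether the length is a multiple of 3, codon count)."""
    length = end - start + 1
    return length, length % 3 == 0, length // 3


for name, (start, end) in ORFS.items():
    length, in_frame, codons = orf_summary(start, end)
    print(f"{name}: {length} bp, multiple of 3: {in_frame}, {codons} codons")
```

Every stated range has a length divisible by three, as expected for an open reading frame.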

In addition, the thermostable DNA polymerase activity was assayed using the above various deletion mutants. The
results demonstrated that the DNA polymerase activity is lost when the deletion involves the ORF3 and ORF4 regions,
regardless of whether the deletion started from upstream or downstream. This finding demonstrated that the translation
products of the ORF3 and the ORF4 were important in the exhibition of the DNA polymerase activity. The base
sequence of the ORF3 is shown by SEQ ID NO:2 in Sequence Listing, and the amino acid sequence of the translation
product of the ORF3 as deduced from the base sequence is shown by SEQ ID NO:1 in Sequence Listing. Also, the
base sequence of ORF4 is shown by SEQ ID NO:4 in Sequence Listing, and the amino acid sequence of the translation
product of ORF4 as deduced from the base sequence is shown by SEQ ID NO:3 in Sequence Listing.
Example 3

Preparation of Purified DNA Polymerase Standard Preparation 

The Escherichia coli JM109/pFU1001 obtained in Example 1 was cultured in 500 ml of an LB medium (10 g/l trypton,
5 g/l yeast extract, 5 g/l NaCl, pH 7.2) containing ampicillin at a concentration of 100 µg/ml. When the culture broth
turbidity reached 0.6 in A600, an inducer, isopropyl-β-D-thiogalactoside (IPTG), was added, and culturing was
continued for 16 hours. After harvesting, the harvested cells were suspended in 37 ml of a sonication buffer [50 mM
Tris-HCl, pH 8.0, 0.2 mM 2-mercaptoethanol, 10% glycerol, 2.4 mM PMSF (phenylmethanesulfonyl fluoride)] and applied
to an ultrasonic disrupter. Forty-two milliliters of a crude extract was recovered as a supernatant by centrifugation
at 12,000 rpm for 10 minutes, which was then heat-treated at 80°C for 15 minutes. Centrifugation was again carried out
at 12,000 rpm for 10 minutes to yield 33 ml of a heat-treated enzyme solution. The above solution was then dialyzed
four times for 2 hours each against 800 ml of buffer A [50 mM potassium phosphate, pH 6.5, 2 mM 2-mercaptoethanol,
10% glycerol] as an external dialysis liquid. After dialysis, 32 ml of the enzyme solution was applied to a RESOURCE Q
column (manufactured by Pharmacia) which was previously equilibrated with buffer A, and subjected to chromatography
using an FPLC system (manufactured by Pharmacia). The chromatogram was developed on a linear concentration gradient
from 0 to 500 mM NaCl. A fraction having a DNA polymerase activity was eluted at 340 mM NaCl.

Ten milliliters of an enzyme solution obtained by collecting the active fraction was desalted and concentrated by
ultrafiltration, and dissolved in buffer A + 150 mM NaCl to yield 3.5 ml of an enzyme solution. The resulting enzyme
solution was then applied to a HiTrap Heparin column (manufactured by Pharmacia), previously equilibrated with the
same buffer. A chromatogram was developed on a linear concentration gradient from 150 to 650 mM NaCl using an FPLC
system, to yield an active fraction eluted at 400 mM NaCl. Five milliliters of this fraction was concentrated to 120 µl
of a solution including 50 mM potassium phosphate, pH 6.5, 2 mM 2-mercaptoethanol, and 75 mM NaCl by repeating
desalting and concentration using ultrafiltration. The resulting concentrated solution was then applied to a gel
filtration column of Superose 6 (manufactured by Pharmacia), previously equilibrated with the same buffer, and eluted
with the same buffer. As a result, fractions having a DNA polymerase activity were eluted at positions corresponding
to retention times of 34.7 minutes and 38.3 minutes. Comparison with the elution positions of molecular weight markers
under the same conditions suggests that these activity peaks have molecular weights of about 385 kDa
and about 220 kDa, respectively. These molecular weights correspond to a complex formed by the translation
product of ORF3 and the translation product of ORF4 in a molar ratio of 1:2, and to another complex formed by the
above translation products in a molar ratio of 1:1, respectively. For the former peak, however, the possibility that a
complex is formed by the two translation products in a 2:2 molar ratio cannot be ruled out, since the molecular weight
determination error increases as the molecular weight increases.
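
The assignment of the two gel filtration peaks to particular subunit ratios can be illustrated with simple arithmetic. The sketch below takes the SDS-PAGE estimates stated earlier in this description (about 90 kDa for the ORF3 product and about 140 kDa for the ORF4 product) as subunit masses and ranks a few candidate stoichiometries by their distance from each observed mass. The candidate list and the function names are illustrative assumptions, not part of the described method.

```python
ORF3_KDA = 90.0    # first DNA polymerase-constituting protein (SDS-PAGE estimate)
ORF4_KDA = 140.0   # second DNA polymerase-constituting protein (SDS-PAGE estimate)

# Candidate (ORF3 product : ORF4 product) molar ratios considered in this sketch.
CANDIDATES = [(1, 1), (1, 2), (2, 1), (2, 2)]


def complex_mass(n_orf3, n_orf4):
    """Expected mass in kDa of a complex with the given subunit counts."""
    return n_orf3 * ORF3_KDA + n_orf4 * ORF4_KDA


def best_ratio(observed_kda):
    """Return the candidate ratio whose expected mass is closest to the observed mass."""
    return min(CANDIDATES, key=lambda r: abs(complex_mass(*r) - observed_kda))


for observed in (385.0, 220.0):
    ratio = best_ratio(observed)
    print(f"{observed:.0f} kDa -> {ratio[0]}:{ratio[1]} ({complex_mass(*ratio):.0f} kDa expected)")
```

With these inputs the 1:2 complex (370 kDa) lies closest to the 385 kDa peak and the 1:1 complex (230 kDa) closest to the 220 kDa peak, while a 2:2 complex would be 460 kDa, which is why it can only be excluded if the gel filtration estimate is sufficiently accurate at high molecular weight.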

Example 4

(1) Biochemical Properties of DNA Polymerase 

For the DNA polymerase preparation obtained in Example 3 forming a complex of the translation products of ORF3
and ORF4, namely the first DNA polymerase-constituting protein and the second DNA polymerase-constituting protein,
in a ratio of 1:1, the optimum MgCl2 and KCl concentrations were first assayed. The DNA polymerase activity was
assayed in a reaction system containing 20 mM Tris-HCl, pH 7.7, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA,
and 40 µM each of dATP, dGTP, dCTP and dTTP in the presence of 2 mM MgCl2, while the KCl concentration was
increased stepwise from 0 to 200 mM in 20 mM increments. As a result, the maximum activity was exhibited at
a KCl concentration of 60 mM. Next, the DNA polymerase activity was assayed in the same reaction system but in the
presence of 60 mM KCl, while the MgCl2 concentration was increased stepwise from 0.5 to 25 mM
in 2.5 mM increments, to compare the activity at each concentration. In this case, the maximum activity was exhibited
at an MgCl2 concentration of 10 mM, whereas in the absence of KCl, the maximum activity was exhibited at an MgCl2
concentration of 17.5 mM.

The optimum pH was then assayed. The DNA polymerase activity was assayed at 75°C by using potassium
phosphate buffers at various pH levels, and preparing a reaction mixture comprising 20 mM potassium phosphate, 15 mM
MgCl2, 2 mM 2-mercaptoethanol, 0.2 mg/ml activated DNA, 40 µM each of dATP, dCTP, dGTP and dTTP, and 60 nM
[3H]-dTTP. The results are shown in FIG. 3, wherein the abscissa indicates the pH, and the ordinate indicates the
radioactivity incorporated in high-molecular DNA. As shown in the figure, the DNA polymerase of the present invention
exhibited the maximum activity at a pH of 6.5 to 7.0. When Tris-HCl was used in place of potassium phosphate, the
activity increased with alkalinity, and the maximum activity was exhibited at a pH of 8.02, the highest pH level used in
the assay.

The heat stability of the DNA polymerase of the present invention was assayed as follows: The purified DNA 
(2) Primer Extension Reaction



[...] a synthetic deoxyribooligonucleotide of the sequence as shown by SEQ ID NO:[...] in Sequence Listing [...]
prepared by annealing a 17-base synthetic ribooligonucleotide of the sequence [...] Taq DNA polymerase derived from
Thermus aquaticus (TaKaRa Taq, manufactured by Takara Shuzo Co., Ltd.) [...] DNA.



(3) Presence or Absence of Associated Exonuclease Activity

The exonuclease activity of the DNA polymerase of the present invention was assessed as follows: As a substrate
for 5'→3' exonuclease activity detection, a DNA fragment labeled with 32P at the 5'-end was prepared by a process
comprising digesting a pUC119 vector (manufactured by Takara Shuzo Co., Ltd.) with SspI (manufactured by Takara
Shuzo Co., Ltd.), separating the resulting 386 bp DNA fragment by agarose gel electrophoresis, purifying the fragment,
and labeling with [γ-32P]-ATP and polynucleotide kinase. Also, as a substrate for 3'→5' exonuclease activity detection,
a DNA fragment labeled with 32P at the 3'-end was prepared by a process comprising digesting a pUC119 vector with
Sau3AI, separating the resulting 341 bp DNA fragment by agarose gel electrophoresis, purifying the fragment, and
carrying out a fill-in reaction using [γ-32P]-CTP (manufactured by Amersham) and the Klenow fragment (manufactured by
Takara Shuzo Co., Ltd.). The labeled DNAs were purified by gel filtration with NICK COLUMN (manufactured by
Pharmacia) and used in the subsequent reaction. To a reaction solution [20 mM Tris-HCl (pH 7.7), 15 mM MgCl2, 2 mM
2-mercaptoethanol] containing 1 ng of these labeled DNAs, 0.015 units of DNA polymerase was added, and the resulting
mixture was reacted at 75°C for 2.5, 5, and 7.5 minutes. The DNAs were precipitated by adding ethanol. The
radioactivity existing in the supernatant was assayed using a liquid scintillation counter, and the amount of
degradation by the exonuclease activity was calculated. The DNA polymerase of the present invention was shown to
possess potent 3'→5' exonuclease activity, while no 5'→3' exonuclease activity was observed. The 3'→5' exonuclease
activity of the Pfu DNA polymerase, known to possess potent 3'→5' exonuclease activity, was also assayed in the same
manner as above. The results are shown together in FIG. 5.

In the figure, the abscissa indicates the reaction time, and the ordinate indicates the ratio of radioactivity released
into the supernatant relative to the radioactivity contained in the entire reaction mixture. Also, the open circles
indicate the results for the DNA polymerase of the present invention, and the solid circles indicate those for the Pfu
DNA polymerase. As shown in the figure, the DNA polymerase of the present invention showed potent 3'→5' exonuclease
activity of the same level as that of the Pfu DNA polymerase, known to possess high accuracy of DNA synthesis owing
to high 3'→5' exonuclease activity.

(4) Comparison of Accuracy of DNA Synthesis Reaction 


The accuracy of DNA synthesis reaction by DNA polymerases was examined using a pUC118 vector (manufactured
by Takara Shuzo Co., Ltd.), partially made single stranded (gapped duplex plasmid), as a template. The single
stranded pUC118 vector was prepared by the method described in Molecular Cloning: A Laboratory Manual, 2nd ed.,
4.44-4.48, published by Cold Spring Harbor Laboratory in 1989, edited by T. Maniatis et al., using a helper phage
M13K07 (manufactured by Takara Shuzo Co., Ltd.) with Escherichia coli MV1184 (manufactured by Takara Shuzo Co.,
Ltd.) as a host. The double stranded DNA was prepared by digesting the pUC118 vector with PvuII (manufactured by
Takara Shuzo Co., Ltd.), subjecting the digested vector to agarose gel electrophoresis, and recovering a DNA fragment
of about 2.8 kbp.

One microgram of the above single stranded DNA and 2 μg of the double stranded DNA were mixed and made up to 180 μl
with sterile distilled water, and the solution was then incubated at 70°C for 10 minutes. Thereafter, twenty
microliters of 20 x SSC was added to the resulting mixture, and the mixture was further kept standing at 60°C for 10
minutes. The DNA was recovered by ethanol precipitation. A portion thereof was subjected to agarose gel
electrophoresis, and it was confirmed that a gapped duplex plasmid was obtained. Thirty microliters of a reaction
mixture [10 mM Tris-HCl, pH 8.5; 50 mM KCl, 10 mM MgCl2, 1 mM each of dATP, dCTP, dGTP and dTTP], containing
one-tenth of the resulting gapped duplex plasmid, was incubated at 70°C for 3 minutes, after which 0.5 units
of DNA polymerase was added thereto, and a DNA synthesis reaction was carried out at 70°C for 10 minutes. After
termination of the reaction, Escherichia coli DH5α (manufactured by BRL) was transformed using 10 μl of the reaction
mixture. The resulting transformant was cultured at 37°C for 18 hours on an LB plate containing 100 μg/ml ampicillin,
0.1 mM IPTG, and 40 μg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactoside. The white and blue colonies formed on the plate
were counted, and the formation rate of white colonies, which result from DNA synthesis errors, was calculated.
As a result, the white colony formation rate was 3.18% when the Taq DNA polymerase was used as the DNA
polymerase, in contrast to a lower formation rate of 1.61% when the DNA polymerase of the present invention was
used.
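
The white colony formation rate reported above is a simple proportion of colony counts. As a minimal sketch (the function name and the colony counts are hypothetical; the text reports only the resulting percentages, not the raw counts), the calculation could look like this:

```python
def white_colony_rate(white: int, blue: int) -> float:
    """Return the white colony formation rate in percent."""
    total = white + blue
    if total == 0:
        raise ValueError("no colonies were counted")
    return 100.0 * white / total


# Hypothetical counts for illustration only (not taken from the text).
example_rate = white_colony_rate(white=16, blue=984)

# Ratio of the two rates reported in the text (Taq vs. present invention).
reported_ratio = 3.18 / 1.61

print(f"example rate: {example_rate:.2f}%")
print(f"Taq rate / present-invention rate: {reported_ratio:.2f}")
```

With the reported values, the Taq DNA polymerase gave roughly twice the white colony formation rate of the DNA polymerase of the present invention.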

(5) Application to PCR

In order to compare the performance of the DNA polymerase of the present invention in PCR with that of the Taq
DNA polymerase, PCR was carried out with λ-DNA as a template. The reaction mixture for the DNA polymerase of the
present invention had the following composition: 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 μM each of
dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin (BSA), and 0.1% Triton X-100. The reaction mixture for the
Taq DNA polymerase had the following composition: 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 50 mM KCl, and 400 μM
each of dATP, dCTP, dGTP and dTTP. Fifty microliters of a reaction mixture containing 5.0 ng/50 μl λ-DNA
(manufactured by Takara Shuzo Co., Ltd.), 10 pmol/50 μl each of primer λ1 and primer λ11, and 3.7 units/50 μl DNA
polymerase was prepared. The base sequences of the primer λ1 and the primer λ11 are shown by SEQ ID NO:8 and SEQ ID NO:9
in Sequence Listing, respectively. Thereafter, a 30-cycle PCR was carried out with the above reaction mixture, wherein
one cycle is defined as at 98°C for 10 seconds and at 68°C for 10 seconds. Five microliters of the reaction mixture was
subjected to agarose gel electrophoresis, and the amplified DNA fragment was confirmed by staining with ethidium
bromide. As a result, no amplification of a DNA fragment was found when the Taq DNA polymerase was used, whereas
amplification of a DNA fragment of about 20 kbp was confirmed when the DNA polymerase of the present invention
was used.

The experiment was then carried out by changing the primers to the primer λ1 and the primer λ10. The base
sequence of the primer λ10 is shown by SEQ ID NO:10 in Sequence Listing. Twenty-five microliters of a reaction
mixture having a similar composition to that shown above and containing 2.5 ng of λ-DNA, 10 pmol each of the primer λ1
and the primer λ10, and 3.7 units of DNA polymerase was prepared. The reaction mixture was reacted in 5
cycles under the same reaction conditions as those described above, and 5 μl of the reaction mixture was subjected to
agarose gel electrophoresis and stained with ethidium bromide. No specific amplification was observed when the
Taq DNA polymerase was used, whereas a DNA fragment of about 15 kbp was amplified when the DNA polymerase of the
present invention was used.

Example 5 

(1) Construction of Plasmid for Expression of ORF3 Translation Product Alone

PCR was carried out using a mutant plasmid 6-82 as a template, together with a primer M4 (manufactured by Takara
Shuzo Co., Ltd.) and the primer N03, whose base sequence is shown by SEQ ID NO:11 in Sequence Listing. The mutant
plasmid was prepared by deleting the portion immediately downstream of the ORF3 from the DNA insert in the plasmid
pFU1002 described in Example 2, in which the ORF1 to the ORF6 are located downstream of the lac promoter on the
vector. The DNA polymerase used for the PCR was the Pfu DNA polymerase (manufactured by Stratagene), which possesses
high accuracy of synthesis reaction. A 25-cycle reaction was carried out with 100 μl of a reaction mixture for PCR [20
mM Tris-HCl, pH 8.2, 10 mM KCl, 20 mM MgCl2, 6 mM (NH4)2SO4, 0.2 mM each of dATP, dCTP, dGTP and dTTP, 1%
Triton X-100, 0.01% BSA] containing 1 ng of a template DNA, 25 pmol of each primer, and 2.5 units of the Pfu DNA
polymerase, wherein one cycle is defined as at 94°C for 0.5 minutes, at 55°C for 0.5 minutes and at
72°C for 2 minutes. The amplified DNA fragment of about 2 kbp was digested with NcoI and SphI (each manufactured
by Takara Shuzo Co., Ltd.) and inserted between the NcoI-SphI sites of the pTV118N vector (manufactured by
Takara Shuzo Co., Ltd.) to prepare a plasmid pFU-ORF3. The DNA insert in the above plasmid contains ORF3 alone
in a translatable state.
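
The cycling program described above can be written down as data, which makes it easy to check the total incubation time. This is only a sketch: the names `Step` and `total_hold_minutes` are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    temp_c: float   # incubation temperature in degrees Celsius
    minutes: float  # hold time at that temperature


def total_hold_minutes(steps: Sequence[Step], cycles: int) -> float:
    """Sum of hold times over all cycles (ramping time not included)."""
    return cycles * sum(step.minutes for step in steps)


# One cycle as stated in the text: 94°C 0.5 min, 55°C 0.5 min, 72°C 2 min.
cycle = [Step(94, 0.5), Step(55, 0.5), Step(72, 2.0)]
CYCLES = 25

print(total_hold_minutes(cycle, CYCLES), "minutes of hold time")
```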

(2) Construction of Plasmid for Expression of ORF4 Translation Product Alone

PCR was carried out using a mutant plasmid 6-2 as a template, the primer M4, and the primer N04, whose base
sequence is shown by SEQ ID NO:12 in Sequence Listing. The mutant plasmid was prepared by deleting the portion
downstream of the center portion of the ORF4 from the DNA insert in the above-described plasmid pFU1002. The
reaction was carried out under the same conditions as those for Example 5-(1) described above, except that the
template DNA was replaced with the plasmid 6-2, and the primer N03 was replaced with the primer N04. A DNA fragment
of about 1.6 kbp obtained by digesting the above amplified DNA fragment with NcoI and NheI (manufactured by Takara
Shuzo Co., Ltd.), together with an about 3.3 kbp NheI-SalI fragment, including the latter portion of ORF4, isolated from
the above plasmid pFU1002, was inserted between the NcoI-XhoI sites of a pET15b vector (manufactured by Novagen)
to prepare a plasmid pFU-ORF4. The DNA insert in the plasmid contains ORF4 alone in a translatable state.

(3) Reconstitution of DNA Polymerase with ORF3 and ORF4 Translation Products

Escherichia coli JM109 transformed with the above-described plasmid pFU-ORF3 (Escherichia coli
JM109/pFU-ORF3) and Escherichia coli HMS174 transformed with the above-described plasmid pFU-ORF4
(Escherichia coli HMS174/pFU-ORF4) were separately cultured, and then the translation products of the two ORFs
expressed in their cells were purified. The cultivation of the transformants and the preparation of the crude extracts were
carried out by the methods described in Example 3. Purification of both translation products was carried out using
columns such as RESOURCE Q, HiTrap Heparin, and Superose 6, while the behaviors of the translation products on
SDS-PAGE were monitored. It was confirmed that although neither of the ORF translation products thus purified exhibited
DNA polymerase activity when assayed alone, thermostable DNA polymerase activity was exhibited when they
were mixed together.


INDUSTRIAL APPLICABILITY 

The present invention can provide a novel DNA polymerase possessing both high primer extensibility and high
3'->5' exonuclease activity. The enzyme is suitable for use in the PCR method and is useful as a reagent for genetic
engineering investigation. It is also possible to produce the enzyme by genetic engineering using the genes encoding
the DNA polymerase of the present invention.

SEQUENCE LISTING

SEQ ID NO: 1

SEQUENCE LENGTH: 613
SEQUENCE TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULAR TYPE: peptide
SEQUENCE DESCRIPTION:

Met Asp Glu Phe Val Lys Ser Leu Leu Lys Ala Asn Tyr Leu Ile
                  5                  10                  15

Thr Pro Ser Ala Tyr Tyr Leu Leu Arg Glu Tyr Tyr Glu Lys Gly
                 20                  25                  30

Glu Phe Ser Ile Val Glu Leu Val Lys Phe Ala Arg Ser Arg Glu
                 35                  40                  45

Ser Tyr Ile Ile Thr Asp Ala Leu Ala Thr Glu Phe Leu Lys Val
                 50                  55                  60

Lys Gly Leu Glu Pro Ile Leu Pro Val Glu Thr Lys Gly Gly Phe
                 65                  70                  75

Val Ser Thr Gly Glu Ser Gln Lys Glu Gln Ser Tyr Glu Glu Ser
                 80                  85                  90

Phe Gly Thr Lys Glu Glu Ile Ser Gln Glu Ile Lys Glu Gly Glu
                 95                 100                 105

Ser Phe Ile Ser Thr Gly Ser Glu Pro Leu Glu Glu Glu Leu Asn
                110                 115                 120

Ser Ile Gly Ile Glu Glu Ile Gly Ala Asn Glu Glu Leu Val Ser
                125                 130                 135

Asn Gly Asn Asp Asn Gly Gly Glu Ala Ile Val Phe Asp Lys Tyr
                140                 145                 150

Gly Tyr Pro Met Val Tyr Ala Pro Glu Glu Ile Glu Val Glu Glu
                155                 160                 165

Lys Glu Tyr Ser Lys Tyr Glu Asp Leu Thr Ile Pro Met Asn Pro
                170                 175                 180

Asp Phe Asn Tyr Val Glu Ile Lys Glu Asp Tyr Asp Val Val Phe
                185                 190                 195

Asp Val Arg Asn Val Lys Leu Lys Pro Pro Lys Val Lys Asn Gly
                200                 205                 210

Asn Gly Lys Glu Gly Glu Ile Ile Val Glu Ala Tyr Ala Ser Leu
                215                 220                 225

Phe Arg Ser Arg Leu Lys Lys Leu Arg Lys Ile Leu Arg Glu Asn
                230                 235                 240

Pro Glu Leu Asp Asn Val Val Asp Ile Gly Lys Leu Lys Tyr Val
                245                 250                 255

Lys Glu Asp Glu Thr Val Thr Ile Ile Gly Leu Val Asn Ser Lys
                260                 265                 270

Arg Glu Val Asn Lys Gly Leu Ile Phe Glu Ile Glu Asp Leu Thr
                275                 280                 285

Gly Lys Val Lys Val Phe Leu Pro Lys Asp Ser Glu Asp Tyr Arg
                290                 295                 300

Glu Ala Phe Lys Val Leu Pro Asp Ala Val Val Ala Phe Lys Gly
                305                 310                 315

Val Tyr Ser Lys Arg Gly Ile Leu Tyr Ala Asn Lys Phe Tyr Leu
                320                 325                 330

Pro Asp Val Pro Leu Tyr Arg Arg Gln Lys Pro Pro Leu Glu Glu
                335                 340                 345

Lys Val Tyr Ala Ile Leu Ile Ser Asp Ile His Val Gly Ser Lys
                350                 355                 360

Glu Phe Cys Glu Asn Ala Phe Ile Lys Phe Leu Glu Trp Leu Asn
                365                 370                 375

Gly Asn Val Glu Thr Lys Glu Glu Glu Glu Ile Val Ser Arg Val
                380                 385                 390

Lys Tyr Leu Ile Ile Ala Gly Asp Val Val Asp Gly Val Gly Val
                395                 400                 405

Tyr Pro Gly Gln Tyr Ala Asp Leu Thr Ile Pro Asp Ile Phe Asp
                410                 415                 420

Gln Tyr Glu Ala Leu Ala Asn Leu Leu Ser His Val Pro Lys His
                425                 430                 435

Ile Thr Met Phe Ile Ala Pro Gly Asn His Asp Ala Ala Arg Gln
                440                 445                 450

Ala Ile Pro Gln Pro Glu Phe Tyr Lys Glu Tyr Ala Lys Pro Ile
                455                 460                 465

Tyr Lys Leu Lys Asn Ala Val Ile Ile Ser Asn Pro Ala Val Ile
                470                 475                 480

Arg Leu His Gly Arg Asp Phe Leu Ile Ala His Gly Arg Gly Ile
                485                 490                 495

Glu Asp Val Val Gly Ser Val Pro Gly Leu Thr His His Lys Pro
                500                 505                 510

Gly Leu Pro Met Val Glu Leu Leu Lys Met Arg His Val Ala Pro
                515                 520                 525

Met Phe Gly Gly Lys Val Pro Ile Ala Pro Asp Pro Glu Asp Leu
                530                 535                 540

Leu Val Ile Glu Glu Val Pro Asp Val Val His Met Gly His Val
                545                 550                 555

His Val Tyr Asp Ala Val Val Tyr Arg Gly Val Gln Leu Val Asn
                560                 565                 570

Ser Ala Thr Trp Gln Ala Gln Thr Glu Phe Gln Lys Met Val Asn
                575                 580                 585

Ile Val Pro Thr Pro Ala Lys Val Pro Val Val Asp Ile Asp Thr
                590                 595                 600

Ala Lys Val Val Lys Val Leu Asp Phe Ser Gly Trp Cys
                605                 610

SEQ ID NO: 2

SEQUENCE LENGTH: 1839
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULAR TYPE: Genomic DNA
SEQUENCE DESCRIPTION:

ATGGATGAAT TTGTAAAATC ACTTCTAAAA GCTAACTATC TAATAACTCC CTCTGCCTAC 60
TATCTCTTGA GAGAATACTA TGAAAAAGGT GAATTCTCAA TTGTGGAGCT GGTAAAATTT 120
GCAAGATCAA GAGAGAGCTA CATAATTACT GATGCTTTAG CAACAGAATT CCTTAAAGTT 180
AAAGGCCTTG AACCAATTCT TCCAGTGGAA ACAAAGGGGG GTTTTGTTTC CACTGGAGAG 240
TCCCAAAAAG AGCAGTCTTA TGAAGAGTCT TTTGGGACTA AAGAAGAAAT TTCCCAGGAG 300
ATTAAAGAAG GAGAGAGTTT TATTTCCACT GGAAGTGAAC CACTTGAAGA GGAGCTCAAT 360
AGCATTGGAA TTGAGGAAAT TGGGGCAAAT GAAGAGTTAG TTTCTAATGG AAATGACAAT 420
GGTGGAGAGG CAATTGTCTT TGACAAATAT GGCTATCCAA TGGTATATGC TCCAGAAGAA 480
ATAGAGGTTG AGGAGAAGGA GTACTCGAAG TATGAAGATC TGACAATACC CATGAACCCC 540
GACTTCAATT ATGTGGAAAT AAAGGAAGAT TATGATGTTG TCTTCGATGT TAGGAATGTA 600
AAGCTGAAGC CTCCTAAGGT AAAGAACGGT AATGGGAAGG AAGGTGAAAT AATTGTTGAA 660
GCTTATGCTT CTCTCTTCAG GAGTAGGTTG AAGAAGTTAA GGAAAATACT AAGGGAAAAT 720
CCTGAATTGG ACAATGTTGT TGATATTGGG AAGCTGAAGT ATGTGAAGGA AGATGAAACC 780
GTGACAATAA TAGGGCTTGT CAATTCCAAG AGGGAAGTGA ATAAAGGATT GATATTTGAA 840
ATAGAAGATC TCACAGGAAA GGTTAAAGTT TTCTTGCCGA AAGATTCGGA AGATTATAGG 900
GAGGCATTTA AGGTTCTTCC AGATGCCGTC GTCGCTTTTA AGGGGGTGTA TTCAAAGAGG 960
GGAATTTTGT ACGCCAACAA GTTTTACCTT CCAGACGTTC CCCTCTATAG GAGACAAAAG 1020
CCTCCACTGG AAGAGAAAGT TTATGCTATT CTCATAAGTG ATATACACGT CGGAAGTAAA 1080
GAGTTCTGCG AAAATGCCTT CATAAAGTTC TTAGAGTGGC TCAATGGAAA CGTTGAAACT 1140
AAGGAAGAGG AAGAAATCGT GAGTAGGGTT AAGTATCTAA TCATTGCAGG AGATGTTGTT 1200
GATGGTGTTG GCGTTTATCC GGGCCAGTAT GCCGACTTGA CGATTCCAGA TATATTCGAC 1260
CAGTATGAGG CCCTCGCAAA CCTTCTCTCT CACGTTCCTA AGCACATAAC AATGTTCATT 1320
GCCCCAGGAA ACCACGATGC TGCTAGGCAA GCTATTCCCC AACCAGAATT CTACAAAGAG 1380
TATGCAAAAC CTATATACAA GCTCAAGAAC GCCGTGATAA TAAGCAATCC TGCTGTAATA 1440
AGACTACATG GTAGGGACTT TCTGATAGCT CATGGTAGGG GGATAGAGGA TGTCGTTGGA 1500
AGTGTTCCTG GGTTGACCCA TCACAAGCCC GGCCTCCCAA TGGTTGAACT ATTGAAGATG 1560
AGGCATGTAG CTCCAATGTT TGGAGGAAAG GTTCCAATAG CTCCTGATCC AGAAGATTTG 1620
CTTGTTATAG AAGAAGTTCC TGATGTAGTT CACATGGGTC ACGTTCACGT TTACGATGCG 1680
GTAGTTTATA GGGGAGTTCA GCTGGTTAAC TCCGCCACCT GGCAGGCTCA GACCGAGTTC 1740
CAGAAGATGG TGAACATAGT TCCAACGCCT GCAAAGGTTC CCGTTGTTGA TATTGATACT 1800
GCAAAAGTTG TCAAGGTTTT GGACTTTAGT GGGTGGTGC 1839
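
As a consistency check between the nucleotide listing (SEQ ID NO: 2) and the amino acid listing (SEQ ID NO: 1), the first 60 bases can be translated with the standard genetic code. The sketch below is illustrative only: the helper name `translate` is hypothetical, and the use of the standard codon table is an assumption rather than something stated in the listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First row (bases 1-60) of SEQ ID NO: 2 as listed above.
FIRST_60 = "ATGGATGAAT TTGTAAAATC ACTTCTAAAA GCTAACTATC TAATAACTCC CTCTGCCTAC"
peptide = translate(FIRST_60)
print(peptide)

# The stated lengths are consistent with a coding region without a stop codon.
print(1839 == 3 * 613, 3789 == 3 * 1263)
```

The translated peptide corresponds to the first 20 residues of SEQ ID NO: 1 (Met Asp Glu Phe Val Lys Ser Leu Leu Lys Ala Asn Tyr Leu Ile Thr Pro Ser Ala Tyr).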


SEQ ID NO: 3

SEQUENCE LENGTH: 1263
SEQUENCE TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULAR TYPE: peptide
SEQUENCE DESCRIPTION:

Met Glu Leu Pro Lys Glu Ile Glu Glu Tyr Phe Glu Met Leu Gln
                  5                  10                  15

Arg Glu Ile Asp Lys Ala Tyr Glu Ile Ala Lys Lys Ala Arg Ser
                 20                  25                  30

Gln Gly Lys Asp Pro Ser Thr Asp Val Glu Ile Pro Gln Ala Thr
                 35                  40                  45

Asp Met Ala Gly Arg Val Glu Ser Leu Val Gly Pro Pro Gly Val
                 50                  55                  60

Ala Gln Arg Ile Arg Glu Leu Leu Lys Glu Tyr Asp Lys Glu Ile
                 65                  70                  75

Val Ala Leu Lys Ile Val Asp Glu Ile Ile Glu Gly Lys Phe Gly
                 80                  85                  90

Asp Phe Gly Ser Lys Glu Lys Tyr Ala Glu Gln Ala Val Arg Thr
                 95                 100                 105

Ala Leu Ala Ile Leu Thr Glu Gly Ile Val Ser Ala Pro Leu Glu
                110                 115                 120

Gly Ile Ala Asp Val Lys Ile Lys Arg Asn Thr Trp Ala Asp Asn
                125                 130                 135

Ser Glu Tyr Leu Ala Leu Tyr Tyr Ala Gly Pro Ile Arg Ser Ser
                140                 145                 150

Gly Gly Thr Ala Gln Ala Leu Ser Val Leu Val Gly Asp Tyr Val
                155                 160                 165

Arg Arg Lys Leu Gly Leu Asp Arg Phe Lys Pro Ser Gly Lys His
                170                 175                 180

Ile Glu Arg Met Val Glu Glu Val Asp Leu Tyr His Arg Ala Val
                185                 190                 195

Ser Arg Leu Gln Tyr His Pro Ser Pro Asp Glu Val Arg Leu Ala
                200                 205                 210

Met Arg Asn Ile Pro Ile Glu Ile Thr Gly Glu Ala Thr Asp Asp
                215                 220                 225

Val Glu Val Ser His Arg Asp Val Glu Gly Val Glu Thr Asn Gln
                230                 235                 240

Leu Arg Gly Gly Ala Ile Leu Val Leu Ala Glu Gly Val Leu Gln
                245                 250                 255

Lys Ala Lys Lys Leu Val Lys Tyr Ile Asp Lys Met Gly Ile Asp
                260                 265                 270

Gly Trp Glu Trp Leu Lys Glu Phe Val Glu Ala Lys Glu Lys Gly
                275                 280                 285

Glu Glu Ile Glu Glu Ser Glu Ser Lys Ala Glu Glu Ser Lys Val
                290                 295                 300

Glu Thr Arg Val Glu Val Glu Lys Gly Phe Tyr Tyr Lys Leu Tyr
                305                 310                 315

Glu Lys Phe Arg Ala Glu Ile Ala Pro Ser Glu Lys Tyr Ala Lys
                320                 325                 330

Glu Ile Ile Gly Gly Arg Pro Leu Phe Ala Gly Pro Ser Glu Asn
                335                 340                 345

Gly Gly Phe Arg Leu Arg Tyr Gly Arg Ser Arg Val Ser Gly Phe
                350                 355                 360

Ala Thr Trp Ser Ile Asn Pro Ala Thr Met Val Leu Val Asp Glu
                365                 370                 375

Phe Leu Ala Ile Gly Thr Gln Met Lys Thr Glu Arg Pro Gly Lys
                380                 385                 390

Gly Ala Val Val Thr Pro Ala Thr Thr Ala Glu Gly Pro Ile Val
                395                 400                 405

Lys Leu Lys Asp Gly Ser Val Val Arg Val Asp Asp Tyr Asn Leu
                410                 415                 420

Ala Leu Lys Ile Arg Asp Glu Val Glu Glu Ile Leu Tyr Leu Gly
                425                 430                 435

Asp Ala Ile Ile Ala Phe Gly Asp Phe Val Glu Asn Asn Gln Thr
                440                 445                 450

Leu Leu Pro Ala Asn Tyr Val Glu Glu Trp Trp Ile Gln Glu Phe
                455                 460                 465

Val Lys Ala Val Asn Glu Ala Tyr Glu Val Glu Leu Arg Pro Phe
                470                 475                 480

Glu Glu Asn Pro Arg Glu Ser Val Glu Glu Ala Ala Glu Tyr Leu
                485                 490                 495

Glu Val Asp Pro Glu Phe Leu Ala Lys Met Leu Tyr Asp Pro Leu
                500                 505                 510

Arg Val Lys Pro Pro Val Glu Leu Ala Ile His Phe Ser Glu Ile
                515                 520                 525

Leu Glu Ile Pro Leu His Pro Tyr Tyr Thr Leu Tyr Trp Asn Thr
                530                 535                 540

Val Asn Pro Lys Asp Val Glu Arg Leu Trp Gly Val Leu Lys Asp
                545                 550                 555

Lys Ala Thr Ile Glu Trp Gly Thr Phe Arg Gly Ile Lys Phe Ala
                560                 565                 570

Lys Lys Ile Glu Ile Ser Leu Asp Asp Leu Gly Ser Leu Lys Arg
                575                 580                 585

Thr Leu Glu Leu Leu Gly Leu Pro His Thr Val Arg Glu Gly Ile
                590                 595                 600

Val Val Val Asp Tyr Pro Trp Ser Ala Ala Leu Leu Thr Pro Leu
                605                 610                 615

Gly Asn Leu Glu Trp Glu Phe Lys Ala Lys Pro Phe Tyr Thr Val
                620                 625                 630

Ile Asp Ile Ile Asn Glu Asn Asn Gln Ile Lys Leu Arg Asp Arg
                635                 640                 645

Gly Ile Ser Trp Ile Gly Ala Arg Met Gly Arg Pro Glu Lys Ala
                650                 655                 660

Lys Glu Arg Lys Met Lys Pro Pro Val Gln Val Leu Phe Pro Ile
                665                 670                 675

Gly Leu Ala Gly Gly Ser Ser Arg Asp Ile Lys Lys Ala Ala Glu
                680                 685                 690

Glu Gly Lys Ile Ala Glu Val Glu Ile Ala Phe Phe Lys Cys Pro
                695                 700                 705

Lys Cys Gly His Val Gly Pro Glu Thr Leu Cys Pro Glu Cys Gly
                710                 715                 720

Ile Arg Lys Glu Leu Ile Trp Thr Cys Pro Lys Cys Gly Ala Glu
                725                 730                 735

Tyr Thr Asn Ser Gln Ala Glu Gly Tyr Ser Tyr Ser Cys Pro Lys
                740                 745                 750

Cys Asn Val Lys Leu Lys Pro Phe Thr Lys Arg Lys Ile Lys Pro
                755                 760                 765

Ser Glu Leu Leu Asn Arg Ala Met Glu Asn Val Lys Val Tyr Gly
                770                 775                 780

Val Asp Lys Leu Lys Gly Val Met Gly Met Thr Ser Gly Trp Lys
                785                 790                 795

Ile Ala Glu Pro Leu Glu Lys Gly Leu Leu Arg Ala Lys Asn Glu
                800                 805                 810

Val Tyr Val Phe Lys Asp Gly Thr Ile Arg Phe Asp Ala Thr Asp
                815                 820                 825

Ala Pro Ile Thr His Phe Arg Pro Arg Glu Ile Gly Val Ser Val
                830                 835                 840

Glu Lys Leu Arg Glu Leu Gly Tyr Thr His Asp Phe Glu Gly Lys
                845                 850                 855

Pro Leu Val Ser Glu Asp Gln Ile Val Glu Leu Lys Pro Gln Asp
                860                 865                 870

Val Ile Leu Ser Lys Glu Ala Gly Lys Tyr Leu Leu Arg Val Ala
                875                 880                 885

Arg Phe Val Asp Asp Leu Leu Glu Lys Phe Tyr Gly Leu Pro Arg
                890                 895                 900

Phe Tyr Asn Ala Glu Lys Met Glu Asp Leu Ile Gly His Leu Val
                905                 910                 915

Ile Gly Leu Ala Pro His Thr Ser Ala Gly Ile Val Gly Arg Ile
                920                 925                 930

Ile Gly Phe Val Asp Ala Leu Val Gly Tyr Ala His Pro Tyr Phe
                935                 940                 945

His Ala Ala Lys Arg Arg Asn Cys Asp Gly Asp Glu Asp Ser Val
                950                 955                 960

Met Leu Leu Leu Asp Ala Leu Leu Asn Phe Ser Arg Tyr Tyr Leu
                965                 970                 975

Pro Glu Lys Arg Gly Gly Lys Met Asp Ala Pro Leu Val Ile Thr
                980                 985                 990

Thr Arg Leu Asp Pro Arg Glu Val Asp Ser Glu Val His Asn Met
                995                1000                1005

Asp Val Val Arg Tyr Tyr Pro Leu Glu Phe Tyr Glu Ala Thr Tyr
               1010                1015                1020

Glu Leu Lys Ser Pro Lys Glu Leu Val Arg Val Ile Glu Gly Val
               1025                1030                1035

Glu Asp Arg Leu Gly Lys Pro Glu Met Tyr Tyr Gly Ile Lys Phe
               1040                1045                1050

Thr His Asp Thr Asp Asp Ile Ala Leu Gly Pro Lys Met Ser Leu
               1055                1060                1065

Tyr Lys Gln Leu Gly Asp Met Glu Glu Lys Val Lys Arg Gln Leu
               1070                1075                1080

Thr Leu Ala Glu Arg Ile Arg Ala Val Asp Gln His Tyr Val Ala
               1085                1090                1095

Glu Thr Ile Leu Asn Ser His Leu Ile Pro Asp Leu Arg Gly Asn
               1100                1105                1110

Leu Arg Ser Phe Thr Arg Gln Glu Phe Arg Cys Val Lys Cys Asn
               1115                1120                1125

Thr Lys Tyr Arg Arg Pro Pro Leu Asp Gly Lys Cys Pro Val Cys
               1130                1135                1140

Gly Gly Lys Ile Val Leu Thr Val Ser Lys Gly Ala Ile Glu Lys
               1145                1150                1155

Tyr Leu Gly Thr Ala Lys Met Leu Val Ala Asn Tyr Asn Val Lys
               1160                1165                1170

Pro Tyr Thr Arg Gln Arg Ile Cys Leu Thr Glu Lys Asp Ile Asp
               1175                1180                1185

Ser Leu Phe Glu Tyr Leu Phe Pro Glu Ala Gln Leu Thr Leu Ile
               1190                1195                1200

Val Asp Pro Asn Asp Ile Cys Met Lys Met Ile Lys Glu Arg Thr
               1205                1210                1215

Gly Glu Thr Val Gln Gly Gly Leu Leu Glu Asn Phe Asn Ser Ser
               1220                1225                1230

Gly Asn Asn Gly Lys Lys Ile Glu Lys Lys Glu Lys Lys Ala Lys
               1235                1240                1245

Glu Lys Pro Lys Lys Lys Lys Val Ile Ser Leu Asp Asp Phe Phe
               1250                1255                1260

Ser Lys Arg

SEQ ID NO: 4

SEQUENCE LENGTH: 3789
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULAR TYPE: Genomic DNA
SEQUENCE DESCRIPTION:

ATGGAGCTTC CAAAGGAAAT TGAGGAGTAT TTTGAGATGC TTCAAAGGGA AATTGACAAA 60 

GCTTACGAGA TTGCTAAGAA GGCTAGGAGT CAGGGTAAAG ACCCCTCAAC CGATGTTGAG 120

ATTCCCCAGG CTACAGACAT GGCTGGAAGA GTTGAGAGCT TAGTTGGCCC TCCCGGAGTT 180 
GCTCAGAGAA TTAGGGAGCT TTTAAAAGAG TATGATAAGG AAATTGTTGC TTTAAAGATA 240
GTTGATGAGA TAATTGAGGG CAAATTTGGT GATTTTGGAA GTAAAGAGAA GTACGCTGAA 300

CAGGCTGTAA GGACAGCCTT GGCAATATTA ACTGAGGGTA TTGTTTCTGC TCCACTTGAG 360 
GGTATAGCTG ATGTTAAAAT CAAGCGAAAC ACCTGGGCTG ATAACTCTGA ATACCTCGCC 420
CTTTACTATG CTGGGCCAAT TAGGAGTTCT GGTGGAACTG CTCAAGCTCT CAGTGTACTT 480 

GTTGGTGATT ACGTTAGGCG AAAGCTTGGC CTTGATAGGT TTAAGCCAAG TGGGAAGCAT 540

ATAGAGAGAA TGGTTGAGGA AGTTGACCTC TATCATAGAG CTGTTTCAAG GCTTCAATAT 600 
CATCCCTCAC CTGATGAAGT GAGATTAGCA ATGAGGAATA TTCCCATAGA AATCACTGGT 660 
GAAGCCACTG ACGATGTGGA GGTTTCCCAT AGAGATGTAG AGGGAGTTGA GACAAATCAG 720 

CTGAGAGGAG GAGCGATCCT AGTTTTGGCG GAGGGTGTTC TCCAGAAGGC TAAAAAGCTC 780

GTGAAATACA TTGACAAGAT GGGGATTGAT GGATGGGAGT GGCTTAAAGA GTTTGTAGAG 840 
GCTAAAGAAA AAGGTGAAGA AATCGAAGAG AGTGAAAGTA AAGCCGAGGA GTCAAAAGTT 900 
GAAACAAGGG TGGAGGTAGA GAAGGGATTC TACTACAAGC TCTATGAGAA ATTTAGGGCT 960 


GAGATTGCCC CAAGCGAAAA GTATGCAAAG GAAATAATTG GTGGGAGGCC GTTATTCGCT 1020 
GGACCCTCGG AAAATGGGGG ATTTAGGCTT AGATATGGTA GAAGTAGGGT GAGTGGATTT 1080
GCAACATGGA GCATAAATCC AGCAACAATG GTTTTGGTTG ACGAGTTCTT GGCCATTGGA 1140
ACTCAAATGA AAACCGAGAG GCCTGGGAAA GGTGCAGTAG TGACTCCAGC AACAACCGCT 1200

GAAGGGCCGA TTGTTAAGCT AAAGGATGGG AGTGTTGTTA GGGTTGATGA TTACAACTTG 1260 
GCCCTCAAAA TAAGGGATGA AGTCGAAGAG ATACTTTATT TGGGAGATGC AATCATAGCC 1320
TTTGGAGACT TTGTGGAGAA CAATCAAACT CTCCTTCCTG CAAACTATGT AGAGGAGTGG 1380 

TGGATCCAAG AGTTCGTAAA GGCCGTTAAT GAGGCATATG AAGTTGAGCT TAGACCCTTT 1440

GAGGAAAATC CCAGGGAGAG CGTTGAGGAA GCAGCAGAGT ACCTTGAAGT TGACCCAGAA 1500
TTCTTGGCTA AGATGCTTTA CGATCCTCTA AGGGTTAAGC CTCCCGTGGA CCTAGCCATA 1560
CACTTCTCGG AAATCCTGGA AATTCCTCTC CACCCATACT ACACCCTTTA TTGGAATACT 1620
GTAAATCCTA AAGATGTTGA AAGACTTTGG GGAGTATTAA AAGACAAGGC CACCATAGAA 1680

TGGGGCACTT TCAGAGGTAT AAAGTTTGCA AAGAAAATTG AAATTAGCCT GGACGACCTG 1740 
GGAAGTCTTA AGAGAACCCT AGAGCTCCTG GGACTTCCTC ATACGGTAAG AGAAGGGATT 1800 
GTAGTGGTTG ATTATCCGTG GAGTGCAGCT CTTCTCACTC CATTGGGCAA TCTTGAATGG 1860
GAGTTTAAGG CCAAGCCCTT CTACACTGTA ATAGACATCA TTAACGAGAA CAATCAGATA 1920
AAGCTCAGGG ACAGGGGAAT AAGCTGGATA GGGGCAAGAA TGGGAAGGCC AGAGAAGGCA 1980
AAAGAAAGAA AAATGAAGCC ACCTGTTCAA GTCCTCTTCC CAATTGGCTT GGCAGGGGGT 2040
TCTAGCAGAG ATATAAAGAA GGCTGCTGAA GAGGGAAAAA TAGCTGAAGT TGAGATTGCT 2100

TTCTTCAAGT GTCCGAAGTG TGGCCATGTA GGGCCTGAAA CTCTCTGTCC CGAGTGTGGG 2160 
ATTAGGAAAG AGTTGATATG GACATGTCCC AAGTGTGGGG CTGAATACAC CAATTCCCAG 2220 

GCTGAGGGGT ACTCGTATTC ATGTGCAAAG TGCAATGTGA AGGTAAAGCC ATTCACAAAG 2280 
AGGAAGATAA AGCCCTCAGA GCTCTTAAAC AGGGCCATGG AAAACGTGAA GGTTTATGGA 2340

GTTGACAAGC TTAAGGGCGT AATGGGAATG ACTTCTGGCT GGAAGATTGC AGAGCCGCTG 2400 
GAGAAAGGTC TTTTGAGAGC AAAAAATGAA GTTTACGTCT TTAAGGATGG AACCATAAGA 2460 
TTTGATGGCA CAGATGCTCC AATAACTCAC TTTAGGCCTA GGGAGATAGG AGTTTCAGTG 2520 
GAAAAGCTGA GAGAGCTTGG CTACACCCAT GACTTCGAAG GGAAACCTCT GGTGAGTGAA 2580
GACCAGATAG TTGAGCTTAA GCCCGAAGAT GTAATCCTCT CAAAGGAGGC TGGCAAGTAC 2640 
CTCTTAAGAG TGGCCAGGTT TGTTGATGAT CTTCTTGAGA AGTTCTACGG ACTTCCCAGG 2700 
TTCTACAACG CCGAAAAAAT GGAGGATTTA ATTGGTCACC TAGTGATAGG ATTGGCCCCT 2760 
CACACTTCAG CCGGAATCGT GGGGAGGATA ATAGGCTTTG TAGATGCTCT GGTTGGCTAC 2820 
GCTCACCCCT ACTTCCATGC GGCCAAGAGA AGGAACTGTG ATGGAGATGA GGATAGTGTA 2880 
ATGCTACTCC TTGATGCCCT ATTGAACTTC TCCAGATACT ACCTGCCCGA AAAAAGAGGA 2940 
GGAAAAATGG ACGCTCCTCT TGTCATAACC ACGAGGCTTG ATCCAAGAGA GGTGGACAGT 3000
GAAGTGCACA ACATGGATGT CGTTAGATAC TATCCATTAG AGTTCTATGA AGCAACTTAC 3060

GAGCTTAAAT CACCAAAGGA ACTTGTGAGA GTTATAGAGG GAGTTGAAGA TAGATTAGGA 3120 
AAGCCTGAAA TGTATTACGG AATAAAGTTC ACCCACGATA CCGACGACAT AGCTCTAGGA 3180 
CCAAAGATGA GCCTCTACAA GCAGTTGGGA GATATGGAGG AGAAAGTGAA GAGGCAATTG 3240 
ACATTGGCAG AGAGAATTAG AGCTGTGGAT CAACACTATG TTGCTGAAAC AATCGTCAAC 3300
TCCCACTTAA TTCCCGACTT GAGGGGTAAC CTAAGGAGCT TTACTAGACA AGAATTTCGC 3360
TGTGTGAAGT GTAAGACAAA GTACAGAAGG CCGCCCTTGG ATGGAAAATG CCCAGTCTGT 3420
GGAGGAAAGA TAGTGCTGAC AGTTAGCAAA GGAGCCATTG AAAAGTACTT GGGGACTGCC 3480 
AAGATGCTCG TAGCTAACTA CAACGTAAAG CCATATACAA GGCAGAGAAT ATGCTTGACG 3540
GAGAAGGATA TTGATTCACT CTTTGAGTAC TTATTCCCAG AAGCCCAGTT AACGCTCATT 3600
GTAGATCCAA ACGACATCTG TATGAAAATG ATCAAGGAAA GAACGGGGGA AACAGTTCAA 3660
GGAGGCCTGC TTGAGAACTT TAATTCCTCT GGAAATAATG GGAAGAAAAT AGAGAAGAAG 3720
GAGAAAAAGG CAAAGGAAAA GCCTAAAAAG AAGAAAGTTA TAAGCTTGGA CGACTTCTTC 3780 
TCCAAACGC 3789 

SEQ ID NO: 5 

SEQUENCE LENGTH: 8450 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULAR TYPE: Genomic DNA
SEQUENCE DESCRIPTION: 

CATAACTAAA TTATTACATT TAGTTATATG GATGGGGGAA AAATTAACAA CATGTGTTAT 60
GTTTCCTCTG GAAAATTGAT CTATAATAAT CTAGGAGCAC AATTTCCAAT GGAGGGTCAT 120 
CAATGAACGA AGGTGAACAT CAAATAAAGC TTGACGAGCT ATTCGAAAAG TTGCTCCGAG 180 
CTAGGAAGAT ATTCAAAAAC AAAGATGTCC TTAGGCATAG CTATACTCCC AAGGATCTAC 240
CTCACAGACA TGAGCAAATA GAAACTCTCG CCCAAATTTT AGTACCAGTT CTCAGAGGAG 300
AAACTCCATC AAACATATTC GTTTATGGGA AGACTGGAAC TGGAAAGACT GTAACTGTAA 360
AATTTGTAAC TGAAGAGCTG AAAAGAATAT CTGAAAAATA CAACATTCCA GTTGATGTGA 420 
TCTACATTAA TTGTGAGATT GTCGATACTC ACTATAGAGT TCTTGCTAAC ATAGTTAACT 480 
ACTTCAAAGA TGAGACTGGG ATTGAAGTTC CAATGGTAGG TTGGCCTACC GATGAAGTTT 540 
ACGCAAAGCT TAAGCAGGTT ATAGATATGA AGGAGAGGTT TGTGATAATT GTGTTGGATG 600 
AAATTGACAA GTGGTAAAGA AGAGTGGTGA TGAGGTTCTC TATTCATTAA CAAGAATAAA 660 
TACTGAACTT AAAAGGGCTA AAGTGAGTGT AATTGGTATA TCAAACGACC TTAAATTTAA 720 
AGAGTATCTA GATCCAAGAG TTCTCTCAAG TTTGAGTGAG GAAGAGGTGG TATTCCCACC 780 
CTATGATGCA AATCAGCTTA GGGATATACT GACCCAAAGA GCTGAAGAGG CCTTTTATCC 840 
TGGGGTTTTA GACGAAGGTG TGATTCCCCT CTGTGCAGCA TTAGCTGCTA GAGAGCATGG 900 
AGATGCAAGA AAGGCACTTG ACCTTCTAAG AGTTGCAGGG GAAATAGCGG AAAGAGAAGG 960 
GGCAAGTAAA GTAACTGAAA AGCATGTTTG GAAAGCCCAG GAAAAGATTG AACAGGACAT 1020 
GATGGAGGAG GTAATAAAAA CTCTACCCCT TCAGTCAAAA GTTCTCCTCT ATGCCATAGT 1080
TCTTTTGGAC GAAAACGGCG ATTTACCAGC AAATACTGGG GATGTTTACG CTGTTTATAG 1140 
GGAATTGTGC GAGTACATTG ACTTGGAACC TCTCACCCAA AGAAGGATAA GTGATCTAAT 1200 
TAATGAGCTT GACATGCTTG GAATAATAAA TGCAAAAGTT GTTAGTAAGG GGAGATATGG 1260
GAGGACAAAG GAAATAAGGC TTAACGTTAC CTCATATAAG ATAAGAAATG TGCTGAGATA 1320
TGATTACTCT ATTCAGCCCC TCCTCACAAT TTCCCTTAAG AGTGAGCAGA GGAGGTTGAT 1380 
CTAATGGATG AATTTGTAAA ATCACTTCTA AAAGCTAACT ATCTAATAAC TCCCTCTGCC 1440 
TACTATCTCT TGAGAGAATA CTATGAAAAA GGTGAATTCT CAATTGTGGA GCTGGTAAAA 1500 
TTTGCAAGAT CAAGAGAGAG CTACATAATT ACTGATGCTT TAGCAACAGA ATTCCTTAAA 1560 
GTTAAAGGCC TTGAACCAAT TCTTCCAGTG GAAACAAAGG GGGGTTTTGT TTCCACTGGA 1620 
GAGTCCCAAA AAGAGCAGTC TTATGAAGAG TCTTTTGGGA CTAAAGAAGA AATTTCCCAG 1680 
GAGATTAAAG AAGGAGAGAG TTTTATTTCC ACTGGAAGTG AACCACTTGA AGAGGAGCTC 1740 
AATAGGATTG GAATTGAGGA AATTGGGGCA AATGAAGAGT TAGTTTCTAA TGGAAATGAC 1800 
AATGGTGGAG AGGCAATTGT CTTTGACAAA TATGGCTATC CAATGGTATA TGCTCCAGAA 1860 
GAAATAGAGG TTGAGGAGAA GGAGTACTCG AAGTATGAAG ATCTGACAAT ACCCATGAAC 1920 
CCCGACTTCA ATTATGTGGA AATAAAGGAA GATTATGATG TTGTCTTCGA TGTTAGGAAT 1980
GTAAAGCTGA AGCCTCCTAA GGTAAAGAAC GGTAATGGGA AGGAAGGTGA AATAATTGTT 2040 
GAAGCTTATG CTTCTCTCTT CAGGAGTAGG TTGAAGAAGT TAAGGAAAAT ACTAAGGGAA 2100 
AATCCTGAAT TGGACAATGT TGTTGATATT GGGAAGCTGA AGTATGTGAA GGAAGATGAA 2160
ACCGTGACAA TAATAGGGCT TGTCAATTCC AAGAGGGAAG TGAATAAAGG ATTGATATTT 2220
GAAATAGAAG ATCTCACAGG AAAGGTTAAA GTTTTCTTGC CGAAAGATTC GGAAGATTAT 2280
AGGGAGGCAT TTAAGGTTCT TCCAGATGCC GTCGTCGCTT TTAAGGGGGT GTATTCAAAG 2340 
AGGGGAATTT TGTACGCCAA CAAGTTTTAC CTTCCAGACG TTCCCCTCTA TAGGAGACAA 2400 
AAGCCTCCAC TGGAAGAGAA AGTTTATGGT ATTCTCATAA GTGATATACA CGTCGGAAGT 2460 
AAAGAGTTCT GCGAAAATGC CTTCATAAAG TTCTTAGAGT GGCTGAATGG AAACGTTGAA 2520 
ACTAAGGAAG AGGAAGAAAT CGTGAGTAGG GTTAAGTATC TAATCATTGC AGGAGATGTT 2580
GTTGATGGTG TTGGCGTTTA TCCGGGCCAG TATGCCGACT TGACGATTCC AGATATATTC 2640 
GACCAGTATG AGGCCCTCGC AAACCTTCTC TCTCACGTTC CTAAGCACAT AACAATGTTC 2700 
ATTGCCCCAG GAAACCACGA TGCTGCTAGG CAAGCTATTC CCCAACCAGA ATTCTACAAA 2760
GAGTATGCAA AACCTATATA CAAGCTCAAG AACGCCGTGA TAATAAGCAA TCCTGCTGTA 2820
ATAAGACTAC ATGGTAGGGA CTTTCTGATA GCTCATGGTA GGGGGATAGA GGATGTCGTT 2880
GGAAGTGTTC CTGGGTTGAC CCATCACAAG CCCGGCCTCC CAATGGTTGA ACTATTGAAG 2940
ATGAGGCATG TAGCTCCAAT GTTTGGAGGA AAGGTTCCAA TAGCTCCTGA TCCAGAAGAT 3000 
TTGCTTGTTA TAGAAGAAGT TCCTGATGTA GTTCACATGG GTCACGTTCA CGTTTACGAT 3060 
GCGGTAGTTT ATAGGGGAGT TCAGCTGGTT AACTCCGCCA CCTGGCAGGC TCAGACCGAG 3120 
TTCCAGAAGA TGGTGAACAT AGTTGCAACG CCTGCAAAGG TTCCCGTTGT TGATATTGAT 3180
ACTGCAAAAG TTGTCAAGGT TTTGGACTTT AGTGGGTGGT GCTGATGGAG CTTCCAAAGG 3240
AAATTGAGGA GTATTTTGAG ATGCTTCAAA GGGAAATTGA CAAAGCTTAC GAGATTGCTA 3300
AGAAGGCTAG GAGTCAGGGT AAAGACCCCT CAACCGATGT TGAGATTCCC CAGGCTACAG 3360
ACATGGCTGG AAGAGTTGAG AGCTTAGTTG GCCCTCCCGG AGTTGCTCAG AGAATTAGGG 3420 

AGCTTTTAAA AGAGTATGAT AAGGAAATTG TTGCTTTAAA GATAGTTGAT GAGATAATTG 3480

AGGGCAAATT TGGTGATTTT GGAAGTAAAG AGAAGTACGC TGAACAGGCT GTAAGGACAG 3540 
CCTTGGCAAT ATTAACTGAG GCTATTGTTT CTGCTCCACT TGAGGGTATA GCTGATGTTA 3600 
AAATCAAGCG AAACACCTGG GCTGATAACT CTGAATACCT CGCCCTTTAC TATGCTGGGC 3660 

CAATTAGGAG TTCTGGTGGA ACTGCTCAAG CTCTCAGTGT ACTTGTTGGT GATTACGTTA 3720

GGCGAAAGCT TGGCCTTGAT AGGTTTAAGC CAAGTGGGAA GCATATAGAG AGAATGGTTG 3780 
AGGAAGTTGA CCTCTATCAT AGAGCTGTTT CAAGGCTTCA ATATCATCCC TCACCTGATG 3840 
AAGTGAGATT AGCAATGAGG AATATTCCCA TAGAAATCAC TGGTGAAGCC ACTGACGATG 3900 

TGGAGGTTTC CCATAGAGAT GTAGAGGGAG TTGAGACAAA TCAGCTGAGA GGAGGAGCGA 3960
TCCTAGTTTT GGCGGAGGGT GTTCTCCAGA AGGCTAAAAA GCTCGTGAAA TACATTGACA 4020
AGATGGGGAT TGATGGATGG GAGTGGCTTA AAGAGTTTGT AGAGGCTAAA GAAAAAGGTG 4080
AAGAAATCGA AGAGAGTGAA AGTAAAGCCG AGGAGTCAAA AGTTGAAACA AGGGTGGAGG 4140
TAGAGAAGGG ATTCTACTAC AAGCTCTATG AGAAATTTAG GGCTGAGATT GCCCCAAGCG 4200
AAAAGTATGC AAAGGAAATA ATTGGTGGGA GGCCGTTATT CGCTGGACCC TCGGAAAATG 4260 
GGGGATTTAG GCTTAGATAT GGTAGAAGTA GGGTGAGTGG ATTTGCAACA TGGAGCATAA 4320 

ATCCAGCAAC AATGGTTTTG GTTGACGAGT TCTTGGCCAT TGGAACTCAA ATGAAAACCG 4380

AGAGGCCTGG GAAAGGTGCA GTAGTGACTC CAGCAACAAC CGCTGAAGGG CCGATTGTTA 4440 
AGCTAAAGGA TGGGAGTGTT GTTAGGGTTG ATGATTACAA CTTGGCCCTC AAAATAAGGG 4500 
ATGAAGTCGA AGAGATACTT TATTTGGGAG ATGCAATCAT AGCCTTTGGA GACTTTGTGG 4560 

AGAACAATCA AACTCTCCTT CCTGCAAACT ATGTAGAGGA GTGGTGGATC CAAGAGTTCG 4620
TAAAGGCCGT TAATGAGGCA TATGAAGTTG AGCTTAGACC CTTTGAGGAA AATCCCAGGG 4680
AGAGCGTTGA GGAAGCAGCA GAGTACCTTG AAGTTGACCC AGAATTCTTG GCTAAGATGC 4740
TTTACGATCC TCTAAGGGTT AAGCCTCCCG TGGAGCTAGC CATACACTTC TCGGAAATCC 4800

TGGAAATTCC TCTCCACCCA TACTACACCC TTTATTGGAA TACTGTAAAT CCTAAAGATG 4860 
TTGAAAGACT TTGGGGAGTA TTAAAAGACA AGGCCACCAT AGAATGGGGC ACTTTCAGAG 4920 
GTATAAAGTT TGCAAAGAAA ATTGAAATTA GCCTGGACGA CCTGGGAAGT CTTAAGAGAA 4980 

CCCTAGAGCT CCTGGGACTT CCTCATACGG TAAGAGAAGG GATTGTAGTG GTTGATTATC 5040

CGTGGAGTGC AGCTCTTCTC ACTCCATTGG GCAATCTTGA ATGGGAGTTT AAGGCCAAGC 5100 
CCTTCTACAC TGTAATAGAC ATCATTAACG AGAACAATCA GATAAAGCTC AGGGACAGGG 5160 
GAATAAGCTG GATAGGGGCA AGAATGGGAA GGCCAGAGAA GGCAAAAGAA AGAAAAATGA 5220 

AGCCACCTGT TCAAGTCCTC TTCCCAATTG GCTTGGCAGG GGGTTCTAGC AGAGATATAA 5280 

AGAAGGCTGC TGAAGAGGGA AAAATAGCTG AAGTTGAGAT TGCTTTCTTC AAGTGTCCGA 5340 
AGTGTGGCCA TGTAGGGCCT GAAACTCTCT GTCCCGAGTG TGGGATTAGG AAAGAGTTGA 5400 
TATGGACATG TCCCAAGTGT GGGGCTGAAT ACACCAATTC CCAGGCTGAG GGGTACTCGT 5460 
ATTCATGTCO AAAGTGCAAT GTGAAGCTAA AGCCATTCAC AAAGAGGAAG ATAAAGCCCT 5520 
CAGAGCTCTT AAACAGGGCC ATGGAAAACG TGAAGGTTTA TGGAGTTGAC AAGCTTAAGG 5580 
GCGTAATGGG AATGACTTCT GGCTGGAAGA TTGCAGAGCC GCTGGAGAAA GGTCTTTTGA 5640 
GAGCAAAAAA TGAAGTTTAC GTCTTTAAGG ATGGAACCAT AAGATTTGAT GCCACAGATG 5700 
CTCCAATAAC TCACTTTAGG CCTAGGGAGA TAGGAGTTTC AGTGGAAAAG CTGAGAGAGC 5760 
TTGGCTACAC GCATGACTTC GAAGGGAAAC CTCTGGTGAG TGAAGACCAG ATAGTTGAGC 5820 
TTAAGCCCCA AGATGTAATC CTCTCAAAGG AGGCTGGCAA GTACCTCTTA AGAGTGGCCA 5880 
GGTTTGTTCSA TGATCTTCTT GAGAAGTTCT ACGGACTTCC CAGGTTCTAC AACGCCGAAA 5940 
AAATGGAGGA TTTAATTGGT CACCTAGTGA TAGGATTGGC CCCTCACACT TCAGCCGGAA 6000 
TCGTGGGGAG GATAATAGGC TTTGTAGATG CTCTGGTTGG CTACGCTCAC CCCTACTTCC 6060 
ATGCGGCCAA GAGAAGGAAC TGTGATGGAG ATGAGGATAG TGTAATGCTA CTCCTTGATG 6120 
CCCTATTGAA CTTCTCCAGA TACTACCTCC CCGAAAAAAG AGGAGGAAAA ATGGACGCTC 6180 
CTCTTGTCAT AAGCAGGAGG CTTGATCCAA GAGAGGTGGA CAGTGAAGTG CACAACATGG 6240 
ATGTCGTTAG ATACTATCCA TTAGAGTTCT ATGAAGCAAC TTACGAGCTT AAATCACCAA 6300 
AGGAACTTGT GAGAGTTATA GAGGGAGTTG AAGATAGATT AGGAAAGCCT GAAATGTATT 6360 
ACGGAATAAA GTTCACCCAC GATACCGACG ACATAGCTCT AGGACCAAAG ATGAGCCTCT 6420 
ACAAGCAGTT GGGAGATATG GAGGAGAAAG TGAAGAGGCA ATTGACATTG GCAGAGAGAA 6480 
TTAGAGCTGT GGATCAACAC TATGTTGCTG AAACAATCCT CAACTCCCAC TTAATTCCCG 6540 
ACTTGAGGGG TAACCTAAGG AGCTTTACTA GACAAGAATT TGGCTGTGTG AAGTGTAACA 6600 
CAAAGTACAG AAGGCCGCCC TTGGATGGAA AATGCCCAGT CTGTGGAGGA AAGATAGTGC 6660 
TGACAGTTAG CAAAGGAGCC ATTGAAAAGT ACTTGGGGAC TGCCAAGATG CTGGTAGCTA 6720 
ACTACAACGT AAAGCCATAT ACAAGGCAGA GAATATGCTT GACGGAGAAG GATATTGATT 6780 
CACTCTTTGA GTACTTATTC CCAGAAGCCC AGTTAACGCT CATTGTAGAT CCAAACGACA 6840 
TCTGTATGAA AATGATCAAG GAAAGAACGG GGGAAACAGT TCAAGGAGGC CTGCTTGAGA 6900 
ACTTTAATTC CTCTGGAAAT AATGGGAAGA AAATAGAGAA GAAGGAGAAA AAGGCAAAGG 6960 
AAAAGCCTAA AAAGAAGAAA GTTATAAGCT TGGACGACTT CTTCTCCAAA CGCTGACCAC 7020 
AACTTTTAAG TTCTTTCTTG AGAATAAATT CCCAGGTGGC TTAGAGAATG AAGATTGTGT 7080 
GGTGTGGTCA TGCCTGCTTC TTGGTGGAGG AT hGGGGG AC TAAGATACTA ATCGATCCAT 7140 
ACCCAGACGT TGATGAAGAC AGAATAGGCA AGGTCGATTA CATTCTAGTT ACCCACGAGC 7200 
ACATGGATCA CTACGGTAAG ACCCCACTAA TAGCAAAGCT CAGTGATGCC GAGGTTATAG 7260 
GGCCGAAAAC AGTTTATCTC ATGGCAATAA GTGATGGGCT AACAAAGGTC AGAGAGATAG 7320 
AGGTGGGACA GGAAATCGAG CTGGGAGATA TTAGGGTTAG GGCATTTTTC ACAGAGCATC 7380 
CAACAAGCCA GTATCCCCTG GGATATCTAA TTGAAGGAAG CAAAAGAGTG GCTCACTTGG 7440 
GAGATACATA CTACAGTCCA GCTTTTACAG AGTTGAGGGG AAAGGTTGAT GTTCTTTTGG 7500 
TTCCAATAGG TGGGAAGTCC ACCGCTAGTG TAAGGGAGGC TGCGGATATA GTGGAGATGA 7560 
TAAGGCCCAG GATAGCAGTT CCAATGCACT ATGGAACGTA CAGCGAGGCC GATCCTGAAG 7620 
AGTTCAAGAA GGAGCTCCAA AAAAGGCGCA TATGGGTTTT AGTAAAGGAT CTTAAGCCCT 7680 
ATGAGGGTTT TGAAATCTGA AGGTGTTTCA ATGCTAAATA CTGAGCTCTT AACCACTGGA 7740 
GTCAAGGGGT TAGATGAGCT TTTAGGTGGT GGAGTTGCTA AGGGAGTAAT ACTCCAAGTT 7800 
TACGGGCCAT TTGCCACCGG GAAGACAACT TTTGCAATGC AGGTTGGATT ATTGAATGAG 7860 
GGAAAAGTGG CTTATGTTGA TACTGAGGGG GGATTCTCCC CCGAAAGGTT AGCTCAAATG 7920 
GCAGAATCAA GGAACTTGGA TGTGGAGAAA GCACTTGAAA AGTTCGTGAT ATTCGAACCT 7980 
ATGGATTTAA ACGAGCAAAG ACAGGTAATT GCGAGGTTGA AAAATATCGT GAATGAAAAG 8040 
TTTTCTTTAG TTGTGGTCGA CTCCTTTACG GCCCATTATA GAGCGGAGGG GAGTAGAGAG 8100 
TATGGAGAAC TTTCCAAGCA ACTCCAAGTT CTTCAGTGGA TTGCCAGAAG AAAAAACGTT 8160 
GCCGTTATAG TTGTCAATCA AGTTTATTAC GATTCAAACT CAGGAATTCT TAAACCAATA 8220 
GCTGAGCACA CCCTGGGGTA CAAAACAAAG GACATCCTCC GCTTTGAAAG GCTTAGGGTT 8280 
GGAGTGAGAA TTGCAGTTCT GGAAAGGCAT AGGTTTAGGC CAGAGGGTGG GATGGTATAC 8340 
TTCAAAATAA CAGATAAAGG ATTGGAGGAT GTAAAAAACG AAGATTAGAG CCTGTCGTAG 8400 
ACCTCCTGGG CAATCCTCAG CGTTGCCTTA TAGAGCTTCT CACTAATAAT 8450 



SEQ ID NO: 6 
SEQUENCE LENGTH: 45 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
CCGGAACCGC CTCCCTCAGA GCCGCCACCC TCAGAACCGC CACCC 

SEQ ID NO: 7 
SEQUENCE LENGTH: 17 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic RNA) 
SEQUENCE DESCRIPTION: 
GUUUUCCCAG UCACGAC 17 

SEQ ID NO: 8 
SEQUENCE LENGTH: 23 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
GATGAGTTCG TGTCCGTACA ACT 23 



SEQ ID NO: 9 
SEQUENCE LENGTH: 22 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
ACAAAGCCAG CCGGAATATC TG 22 

SEQ ID NO: 10 
SEQUENCE LENGTH: 22 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
TACAATACGA TGCCCCGTTA AG 22 

SEQ ID NO: 11 
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
CAGAGGAGGT TGATCCCATG GATGAATTTG TA 

SEQ ID NO: 12 
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULAR TYPE: other nucleic acid (synthetic DNA) 
SEQUENCE DESCRIPTION: 
TTTAGTGGGT GGTGCCCATG GAGCTTCCAA AG 
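
Because the synthetic oligonucleotides of SEQ ID NO: 6 to 12 each state a SEQUENCE LENGTH, the listing can be cross-checked mechanically. The following is a minimal sketch, not part of the original listing: the dictionary layout and the function names `residue_count` and `check_lengths` are illustrative assumptions, and the sequences are transcribed from the entries above.

```python
# Declared lengths and sequences transcribed from SEQ ID NO: 6 to 12.
# The data structure and names are illustrative, not part of the listing.
SEQUENCES = {
    6: (45, "CCGGAACCGC CTCCCTCAGA GCCGCCACCC TCAGAACCGC CACCC"),
    7: (17, "GUUUUCCCAG UCACGAC"),
    8: (23, "GATGAGTTCG TGTCCGTACA ACT"),
    9: (22, "ACAAAGCCAG CCGGAATATC TG"),
    10: (22, "TACAATACGA TGCCCCGTTA AG"),
    11: (32, "CAGAGGAGGT TGATCCCATG GATGAATTTG TA"),
    12: (32, "TTTAGTGGGT GGTGCCCATG GAGCTTCCAA AG"),
}


def residue_count(grouped: str) -> int:
    """Count residues, ignoring the spaces between 10-base groups."""
    return len(grouped.replace(" ", ""))


def check_lengths(entries):
    """Return {seq_id: (declared, counted)} for entries whose lengths differ."""
    mismatches = {}
    for seq_id, (declared, grouped) in entries.items():
        counted = residue_count(grouped)
        if counted != declared:
            mismatches[seq_id] = (declared, counted)
    return mismatches


if __name__ == "__main__":
    print(check_lengths(SEQUENCES) or "all declared lengths match")
```

An empty result from `check_lengths` indicates that every transcribed sequence has the number of residues its entry declares.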




Claims 

1. A DNA polymerase characterized in that said DNA polymerase possesses the following properties: 

1) exhibiting higher polymerase activity when assayed by using as a substrate a complex resulting from primer 
annealing to a single stranded template DNA, as compared to the case where an activated DNA is used as a 
substrate; 

2) possessing a 3'→5' exonuclease activity; 

3) being capable of amplifying a DNA fragment of about 20 kbp, in the case where polymerase chain reaction 
(PCR) is carried out using λ-DNA as a template under the following conditions: 

PCR conditions: 

(a) a composition of reaction mixture: containing 10 mM Tris-HCl (pH 9.2), 3.5 mM MgCl2, 75 mM KCl, 400 
µM each of dATP, dCTP, dGTP and dTTP, 0.01% bovine serum albumin, 0.1% Triton X-100, 5.0 ng/50 µl 
λ-DNA, 10 pmole/50 µl primer λ1 (SEQ ID NO:8 in Sequence Listing), primer λ11 (SEQ ID NO:9 in 
Sequence Listing), and 3.7 units/50 µl DNA polymerase; 

(b) reaction conditions: carrying out a 30-cycle PCR, wherein one cycle is defined as at 98°C for 10 seconds 
and at 68°C for 10 minutes. 
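
The quantities stated in the PCR conditions of claim 1 can be converted into more familiar figures with simple arithmetic. The sketch below is illustrative only: the constant and function names are assumptions, the 10 pmole amount is assumed to apply to each primer, and ramp times between temperature steps are ignored.

```python
# Values taken from the PCR conditions of claim 1; names are illustrative.
REACTION_VOLUME_UL = 50.0   # reaction volume in microlitres
PRIMER_PMOL = 10.0          # assumed to be per primer, per 50 µl reaction
CYCLES = 30
STEP_98C_S = 10             # 98 °C for 10 seconds
STEP_68C_S = 10 * 60        # 68 °C for 10 minutes


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Convert pmol per µl to µM (1 pmol/µl equals 1 µmol/l)."""
    return pmol / volume_ul


def total_hold_time_s(cycles: int, *step_seconds: int) -> int:
    """Programmed hold time only; ramping between steps is not counted."""
    return cycles * sum(step_seconds)


if __name__ == "__main__":
    conc = primer_concentration_um(PRIMER_PMOL, REACTION_VOLUME_UL)
    total = total_hold_time_s(CYCLES, STEP_98C_S, STEP_68C_S)
    hours, remainder = divmod(total, 3600)
    print(f"primer concentration: {conc} µM")
    print(f"hold time: {total} s ({hours} h {remainder // 60} min)")
```

Under these assumptions each primer is present at 0.2 µM, and the 30 cycles amount to 18,300 seconds (5 hours 5 minutes) of programmed hold time.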

2. The DNA polymerase according to claim 1, characterized in that said DNA polymerase exhibits a lower error rate 
in DNA synthesis as compared to Taq DNA polymerase. 

3. The DNA polymerase according to claim 1 or 2, wherein the molecular weight as determined by gel filtration 
method is about 220 kDa or about 385 kDa. 

4. The DNA polymerase according to any one of claims 1 to 3, characterized in that said DNA polymerase exhibits an 
activity under coexistence of two kinds of DNA polymerase-constituting protein, a first DNA polymerase-constituting 
protein and a second DNA polymerase-constituting protein. 

5. The DNA polymerase according to claim 4, characterized in that the molecular weights of said first DNA 
polymerase-constituting protein and said second DNA polymerase-constituting protein are about 90,000 Da and about 
140,000 Da as determined by SDS-PAGE, respectively. 

6. The DNA polymerase according to claim 4 or 5, characterized in that said first DNA polymerase-constituting protein 
which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as shown by 
SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity 
which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid 
sequence. 

7. The DNA polymerase according to claim 4 or 5, characterized in that said second DNA polymerase-constituting 
protein which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as 
shown by SEQ ID NO:3 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same 
activity which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid 
sequence. 

8. The DNA polymerase according to claim 4 or 5, characterized in that said first DNA polymerase-constituting protein 
which constitutes the DNA polymerase according to claim 4 or 5 comprises the amino acid sequence as shown by 
SEQ ID NO:1 in Sequence Listing, or is a functional equivalent thereof possessing substantially the same activity 
which results from deletion, insertion, addition or substitution of one or more amino acids in said amino acid 
sequence, and that said second DNA polymerase-constituting protein which constitutes the DNA polymerase 
according to claim 4 or 5 comprises the amino acid sequence as shown by SEQ ID NO:3 in Sequence Listing, or 
is a functional equivalent thereof possessing substantially the same activity which results from deletion, insertion, 
addition or substitution of one or more amino acids in said amino acid sequence. 

9. A first DNA polymerase-constituting protein which constitutes the DNA polymerase according to claim 4 or 5, 
wherein said first DNA polymerase-constituting protein comprises the amino acid sequence as shown by SEQ ID 
NO:1, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino 
acids in said amino acid sequence as a functional equivalent thereof possessing substantially the same activity. 

10. A second DNA polymerase-constituting protein which constitutes the DNA polymerase according to claim 4 or 5, 
wherein said second DNA polymerase-constituting protein comprises the amino acid sequence as shown by SEQ 
ID NO:3, or an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more amino 
acids in said amino acid sequence as a functional equivalent thereof possessing substantially the same activity. 

11. A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to claim 9, 
characterized in that said DNA comprises an entire sequence of a base sequence encoding the amino acid [...] 
a protein having an amino acid sequence resulting from deletion, insertion, addition or substitution of one or more 
amino acids in the amino acid sequence of SEQ ID NO:1 and possessing a function as the first DNA 
polymerase-constituting protein. 

12. A DNA containing a base sequence encoding the first DNA polymerase-constituting protein according to claim 9, 
characterized in that said DNA comprises an entire sequence of the base sequence as shown by SEQ ID NO:2 in 
Sequence Listing or a partial sequence thereof, or that said DNA comprises a base sequence capable of hybridizing 
thereto under stringent conditions. 

13. A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to claim 
10, characterized in that said DNA comprises an entire sequence of a base sequence encoding the amino acid [...] 
sequence resulting from deletion, insertion, addition or substitution of one or more amino acids in the 
amino acid sequence of SEQ ID NO:3 and possessing a function as the second DNA polymerase-constituting protein. 

14. A DNA containing a base sequence encoding the second DNA polymerase-constituting protein according to claim 
10, characterized in that said DNA comprises an entire sequence of the base sequence as shown by SEQ ID NO:4 
in Sequence Listing or a partial sequence thereof, or that said DNA comprises a base sequence capable of 
hybridizing thereto under stringent conditions. 

15. A method for producing a DNA polymerase, characterized in that the method comprises culturing a transformant 
containing both a gene encoding the first DNA polymerase-constituting protein according to claim 11 or 12 and a gene 
encoding the second DNA polymerase-constituting protein according to claim 13 or 14, and collecting the DNA 
polymerase from the resulting culture. 



16. A method for producing a DNA polymerase, characterized in that the method comprises culturing a transformant 
containing a gene encoding the first DNA polymerase-constituting protein according to claim 11 or 12, and a 
transformant containing a gene encoding the second DNA polymerase-constituting protein according to claim 13 or 14, 
separately; mixing DNA polymerase-constituting proteins contained in the resulting culture; and collecting the DNA 
polymerase. 